Secondary Skills: SQL
Job Location:
Pune, Chennai
Posted Date:
389 days ago
Job Description
Your Responsibilities
Support and optimize systems that process large volumes of data, focusing on scalability, latency, efficiency, and fault tolerance in every system that you manage
Demonstrate technical and ownership skills to go very deep or broad in solving classes of problems
Oversee components across big data platforms, and work with developers and engineering leaders to identify system improvements that raise performance
Ensure that we are measuring, reporting and improving our MTTD, MTTR, MTBF and other operational parameters
Be a role model for behaviors such as making data-driven everyday decisions, approaching everyday problems with scientific temperament and rigor, keeping a customer-first mindset, and maintaining the highest standards of operational excellence
Demonstrate the belief that you can achieve more on a team, that the whole is greater than the sum of its parts, and rely on others’ candid feedback for continuous improvement
Go beyond the written word: whether you’re working on an API used by other developers, an internal tool consumed by our pricing teams, or a feature used by millions of our clients, your attention to detail leads to a delightful user experience
Your Qualifications:
Strong sense of ownership, focus on quality, team orientation, design thinking, responsiveness, efficiency and innovation
Adept at system monitoring, user request handling and incident management, with an ability to work with distributed teams in a collaborative and productive manner
2 to 6 years of experience supporting large-scale products and distributed systems in a high-caliber environment
In-depth knowledge of UNIX and SQL, plus coding or scripting knowledge in at least one language, preferably Java, Python, Golang or C++
Prior experience supporting big data systems such as Hadoop, Hive, Spark, Kafka and Flink
Experience in handling and triaging complex production issues
Prior experience with observability tools like Grafana, Prometheus, Splunk and Dynatrace
Prior exposure to Azure, GCP and NoSQL-based cloud-native databases
Experience with Site Reliability Engineering practices, including operational architectures, observability, reliability, availability and scalability