PH

Site Reliability Engineer 2 - Hadoop

PhonePe
Bangalore5-9 LPA Posted 23 May 2025
FULL TIME
Microsoft Azure
Linux
Mysql
Python
Hadoop

Job Description

Key deliverables:

  1. Manage and administer Hadoop ecosystem components and performance tuning
  2. Automate operational workflows using Airflow, Ansible, or scripting languages
  3. Monitor and troubleshoot system-level and network-level issues across hybrid cloud environments
  4. Collaborate with security and platform teams to apply patches, upgrades, and enhance data infrastructure

Role responsibilities:

  1. Ensure high availability and reliability of Hadoop clusters and supporting tools
  2. Maintain infrastructure across AWS/Azure/GCP and optimize load balancing with Nginx
  3. Develop tools and monitoring solutions using Prometheus, Grafana, ELK
  4. Support on-call rotations, resolve production incidents, and drive root cause analysis

Join WhatsApp Channel