Bosch India
2024_MS_EDE3_XC_SRE_DataEngineering
Bangalore ₹5-6 LPA Posted 21 Mar 2025
FULL TIME
DevOps
Data Engineering
NoSQL
Azure
Job Description
- As a Site Reliability Engineer (SRE), you will be responsible for ensuring the reliability, scalability, and performance of the systems that support the products and services of the Data Engineering projects.
- You will work closely with function developers, architects, and DevOps teams to build and maintain high-availability systems capable of handling high workloads, with automation and active monitoring of the infrastructure.
- As an SRE, you will ensure system reliability and availability for continuous deployment as part of Agile practices in solution development.
Mandatory Skills & experience in:
- Experience with cloud platforms, specifically Azure.
- Hands-on experience with and proficiency in cloud infrastructure and CI/CD frameworks for providing IaC (Terraform, ARM, YAML), and in cloud-native containerization and deployment of services, e.g. Docker, k8s.
- Hands-on experience with large-scale Azure DevOps and Azure PaaS components.
- Must have tool knowledge: Argo, Terraform (CLI), Azure CLI, kubectl, Flux, Helm, Argo Events and Argo Workflows, Istio, Grafana, Kustomize, plus YAML-based coding and debugging skills.
- Must have Kubernetes administration skills; knowledge of Kubernetes tools and extensions is good to have.
- Understanding of function development for data science solutions and of programming languages such as Python and Go.
- Excellent problem-solving skills and attention to detail.
- Hands-on experience with architecting and developing features using microservice application principles.
- Deep understanding of Service Level Objectives (SLOs), Service Level Indicators (SLIs), error budgeting and configuring KPIs for highly sophisticated services.
- Experience with the ELK stack (Elasticsearch, Logstash, Kibana) and Prometheus for monitoring and logging.
- Solid expertise in applying cloud security best practices through DevSecOps principles, with a deep understanding of Kubernetes (k8s) security.
Preferred Skills & experience in:
- Experience with DevOps, data pipelines, and various messaging systems in a cloud-native setup (MS Azure).
- Experience with database technologies (MongoDB, NoSQL, etc.) and cloud-native optimization services.
- Strong working knowledge of Azure.
- Motivating attitude, strong communication and interpersonal skills, and a structured, analytical approach.
- Knowledge of costing and optimization techniques for large-scale cloud-native services.
Key Responsibilities:
- System Reliability: Design and engineer highly scalable, highly available systems for high-throughput workloads.
- Continuous monitoring & active alerting: Develop, deploy, and manage monitoring systems, setting up alerts to proactively identify and resolve issues.
- Automation: Automate routine tasks such as deployments, monitoring, and policy enforcement using suitable frameworks.
- Performance Tuning: Optimize system performance by identifying bottlenecks and implementing appropriate solutions.
- Infrastructure as Code (IaC): Utilize tools like Terraform, Ansible, or similar to manage infrastructure through code, ensuring consistency and repeatability.
- Scaling & Cost Management: Analyze system performance and plan for future scaling needs.
- Issue Handling and Resolution: Respond to system outages, perform root cause analysis, and implement fixes to prevent future incidents.
