Staff, Software Engineer

Walmart
Chennai
11-20 LPA
Posted 30 Apr 2026
FULL TIME
Kubernetes
System Design
Distributed Systems
Site Reliability Engineering
Cloud Computing

Job Description

• Design and develop scalable, reliable enterprise architectures

• Build automation tools to improve system reliability and reduce manual effort

• Drive improvements in system performance, latency, and availability

• Participate in capacity planning, demand forecasting, and system tuning

• Troubleshoot large-scale production issues and ensure rapid resolution

• Define and track SRE metrics such as SLOs, SLIs, error budgets, and latency

• Implement monitoring, alerting, and observability frameworks

• Collaborate with infrastructure and business teams to achieve operational excellence

• Lead and mentor engineering teams and foster a high-performance culture

• Ensure adherence to engineering best practices and continuous improvement
