Mastercard
Senior BizOps Engineer
Pune | ₹5-10 LPA | Posted 29 Apr 2025
FULL TIME
Incident Response
Root Cause Analysis
System Performance
Capacity Planning
Automation
Job Description
Role Overview:
- Business Operations Site Reliability Engineer (SRE):
- The role of the Business Operations team is to act as the production readiness steward for Mastercard products.
- As a BizOps SRE, the primary responsibility is ensuring the stability and health of the platform.
- Foster developer-run ownership and empower developers to build resilient products.
- Support developers during the application build phase with operational design, automation, capacity planning, and monitoring, ensuring fault-tolerant and scalable products.
- Create and enforce operational standards while fostering an agile and learning culture.
- Focus on triage and root cause analysis, understanding the business impact of products, and performing blameless post-mortems.
- Engage early in the development lifecycle to be proactive and manage production and change activities to maximize customer experience.
- Focus on risk management, compliance, and risk mitigation across all environments.
- Align product and customer-focused priorities with operational needs by providing continuous feedback throughout the lifecycle.
Mission:
- The mission is to ensure production readiness through close collaboration with developers to design, build, implement, and support technology services.
- Ensure operational criteria such as system availability, capacity, performance, monitoring, self-healing, and deployment automation are implemented throughout the delivery process.
- Lead the DevOps transformation at Mastercard through tooling and by advocating for change and standards across development, quality, release, and product organizations.
- Support daily operations with a hyper-focus on triage and root cause analysis, understanding business impacts and conducting blameless post-mortems.
- Shift left in the development process, becoming more proactive to maximize customer experience and increase the value of supported applications.
- Focus on streamlining and standardizing application-specific support activities and centralizing points of interaction for both internal and external partners.
- Communicate effectively with key stakeholders to align product and customer-focused priorities with operational needs.
Key Responsibilities:
- Operational Readiness Architect:
- Serve as the primary contact responsible for the overall health, performance, and capacity of applications.
- Support services before they go live by engaging in system design consulting, capacity planning, and launch reviews.
- Partner with development and product teams to establish monitoring and alerting strategies, ensuring zero downtime during deployment.
- Site Reliability Engineering (SRE):
- Ensure application scalability, performance, and resilience.
- Practice sustainable incident response and blameless post-mortems.
- Take a holistic approach to problem-solving and optimize recovery time.
- Automate data-driven alerts to proactively escalate issues and work with development teams to establish Service Level Objectives (SLOs) to improve reliability.
- DevOps/Automation:
- Address complex development, automation, and business process challenges.
- Engage in and improve the entire lifecycle of services, from inception and design to deployment, operation, and refinement.
- Support the CI/CD pipeline, ensuring smooth promotion of software into higher environments through validation and operational gating.
- Lead Mastercard in DevOps automation and best practices.
- Increase automation and tooling to reduce manual interventions and toil.
- ITSM Practices:
- Analyze ITSM activities of the platform and provide feedback to development teams on operational gaps or resiliency concerns.
Role Qualifications:
- Education and Experience:
- BS degree in Computer Science, a related technical field (e.g., physics, mathematics), or equivalent practical experience.
- Exposure to coding and/or scripting.
- An appetite for pushing the boundaries of automation and exploring new technology, infrastructure, and practices to scale architecture for future growth.
- Technical and Analytical Skills:
- Experience with algorithms, data structures, scripting, pipeline management, and software design.
- Systematic problem-solving approach with strong communication skills and a sense of ownership.
- Interest in designing, analyzing, and troubleshooting large-scale distributed systems.
- Comfortable collaborating with cross-functional teams to ensure expected system behavior is understood and monitoring is in place to detect anomalies.
- Additional Skills:
- Ability to balance doing things correctly with fixing issues quickly.
- Flexible and pragmatic, working towards the long-term health of systems.
- Willingness to learn and take on challenging opportunities while being part of a matrix-based, diverse, and geographically distributed team.
- Ability to prioritize and build relationships across development, operations, and product teams.
