Python, PySpark, ETL Developer

Infosys Limited
Hyderabad Posted 30 May 2026
FULL TIME
ETL
PySpark
Big Data
Python

Job Description

  • Build and scale data solutions that power smarter decisions.
  • In this role, you'll work at the intersection of software engineering and data engineering, using Python, PySpark, and ETL to transform raw, complex datasets into reliable, analytics-ready assets.
  • You'll collaborate closely with data engineers, analysts, and stakeholders to understand requirements, design efficient pipelines, and deliver high-quality outputs on time.
  • If you enjoy solving performance challenges, improving data quality, and creating maintainable code that runs in production, this is a great opportunity to grow your impact.
  • Expect a supportive, collaborative environment where ownership is encouraged, learning is continuous, and your contributions directly improve how teams access and trust data.

Key Responsibilities:

  • Data Pipeline Development:
      • Develop and maintain scalable batch ETL pipelines using Python and PySpark for data ingestion, transformation, and loading.
      • Implement reusable transformation logic, ensuring pipelines are modular, testable, and easy to maintain.
      • Optimize Spark jobs for performance (partitioning, caching, joins, shuffles) and cost efficiency.
  • Data Quality & Reliability:
      • Apply data validation checks, handle schema evolution, and ensure the accuracy and completeness of processed datasets.
      • Troubleshoot pipeline failures, analyze logs, and implement robust error handling and retry mechanisms.
      • Monitor job runs and support operational stability through alerts, runbooks, and timely incident resolution.
  • Collaboration & Delivery:
      • Work with cross-functional teams to gather requirements, define data mappings, and deliver datasets aligned to business needs.
      • Participate in code reviews, follow engineering best practices, and contribute to continuous improvement of standards and tooling.
      • Document pipeline logic, dependencies, and operational procedures for smooth handovers and long-term maintainability.

Technical Requirements:

  • Technology -> Analytics - Packages -> Python - Big Data
  • Technology -> Big Data - Data Processing -> PySpark, ETL

Additional Responsibilities:

  • Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field, or equivalent practical experience.
  • 2-5 years of hands-on experience building data pipelines using Python and PySpark.
  • Strong understanding of ETL concepts, data transformations, and handling large-scale datasets.
  • Proficiency in writing clean, maintainable code and debugging production issues.
  • Working knowledge of data structures, algorithms, and software development best practices.

Preferred Skills:

  • Technology -> Analytics - Packages -> Python - Big Data
  • Technology -> Big Data - Data Processing -> PySpark