Data Engineering Lead

Bangalore | ₹5-7 LPA | Posted 13 Oct 2025

FULL TIME

Hive

Apache Spark

Pandas

PySpark

Python

Job Description

Key Responsibilities:

Technical Leadership

Lead and mentor a team of data engineers, fostering best practices in coding, design, and delivery.
Drive adoption of modern data engineering frameworks, tools, and methodologies.
Translate complex business requirements into data pipelines, architectures, and workflows.

Data Pipeline Development

Architect, develop, and optimize scalable ETL/ELT pipelines using Apache Spark, Hive, AWS Glue, and Trino.
Handle complex data workflows involving structured and unstructured data.
Develop real-time and batch processing systems supporting BI, analytics, and ML applications.

Cloud & Infrastructure Management

Build and maintain cloud-based data solutions with AWS services like S3, Athena, Redshift, EMR, DynamoDB, and Lambda.
Design federated query capabilities using Trino.
Manage Hive Metastore for schema and metadata in data lakes.

Performance Optimization

Optimize Apache Spark jobs and Hive queries for performance and resource efficiency.
Implement caching and indexing strategies in Trino.
Monitor and tune system performance continuously.

Collaboration & Stakeholder Engagement

Work closely with data scientists, analysts, and business teams to deliver actionable insights.
Ensure data infrastructure aligns with organizational goals and compliance standards.

Data Governance & Quality

Establish and enforce data quality standards, governance practices, and monitoring.
Ensure data security, privacy, and regulatory compliance.

Innovation & Continuous Learning

Stay updated on industry trends and emerging data engineering technologies.
Identify and implement improvements in data architecture and processes.

Required Skills

Hive, Apache Spark, Pandas, PySpark, Python
