Data Engineering Lead

Apex One
Bangalore
5-7 LPA
Posted 13 Oct 2025
FULL TIME
Hive
Apache Spark
Pandas
PySpark
Python

Job Description

Key Responsibilities:

Technical Leadership

  • Lead and mentor a team of data engineers, fostering best practices in coding, design, and delivery.
  • Drive adoption of modern data engineering frameworks, tools, and methodologies.
  • Translate complex business requirements into data pipelines, architectures, and workflows.

Data Pipeline Development

  • Architect, develop, and optimize scalable ETL/ELT pipelines using Apache Spark, Hive, AWS Glue, and Trino.
  • Handle complex data workflows involving structured and unstructured data.
  • Develop real-time and batch processing systems supporting BI, analytics, and ML applications.

Cloud & Infrastructure Management

  • Build and maintain cloud-based data solutions with AWS services like S3, Athena, Redshift, EMR, DynamoDB, and Lambda.
  • Design federated query capabilities using Trino.
  • Manage the Hive Metastore to maintain schemas and metadata for data lake tables.

Performance Optimization

  • Optimize Apache Spark jobs and Hive queries for performance and resource efficiency.
  • Implement caching and indexing strategies in Trino.
  • Monitor and tune system performance continuously.

Collaboration & Stakeholder Engagement

  • Work closely with data scientists, analysts, and business teams to deliver actionable insights.
  • Ensure data infrastructure aligns with organizational goals and compliance standards.

Data Governance & Quality

  • Establish and enforce data quality standards, governance practices, and monitoring.
  • Ensure data security, privacy, and regulatory compliance.

Innovation & Continuous Learning

  • Stay updated on industry trends and emerging data engineering technologies.
  • Identify and implement improvements in data architecture and processes.
