
Data Engineering Lead

Apex One
Bangalore | 4-7 LPA | Posted 13 Oct 2025
FULL TIME
Hive
Apache Spark
PySpark
AWS Glue
Python

Job Description

Key Responsibilities:

Technical Leadership

  • Lead and mentor a team of data engineers.
  • Promote best practices in design, development, and delivery.
  • Translate business requirements into robust data architectures.

Data Pipeline Development

  • Develop and optimize ETL/ELT pipelines using Apache Spark, Hive, AWS Glue, and Trino.
  • Handle both structured and unstructured data workflows.
  • Build real-time and batch processing systems for analytics and ML.

Cloud & Infrastructure Management

  • Design cloud-based data solutions with AWS services (S3, Athena, Redshift, EMR, etc.).
  • Manage the Hive Metastore for schemas and table metadata.
  • Implement federated querying with Trino.

Performance Optimization

  • Tune Apache Spark, Hive, and Trino workloads.
  • Use indexing, caching, partitioning, and bucketing to accelerate queries.
  • Monitor performance and implement continuous improvements.

Collaboration & Stakeholder Engagement

  • Partner with data scientists, analysts, and business teams.
  • Align data infrastructure with business and compliance needs.

Data Governance & Quality

  • Define and enforce data quality and governance standards.
  • Ensure compliance with security and privacy regulations.

Innovation & Continuous Learning

  • Stay updated on emerging technologies and trends.
  • Continuously improve architecture and processes.
