Apex One
Data Engineering Lead
Bangalore | ₹4-7 LPA | Posted 13 Oct 2025
FULL TIME
Hive
Apache Spark
PySpark
AWS Glue
Python
Job Description
Key Responsibilities:
Technical Leadership
- Lead and mentor a team of data engineers.
- Promote best practices in design, development, and delivery.
- Translate business requirements into robust data architectures.
Data Pipeline Development
- Develop and optimize ETL/ELT pipelines using Apache Spark, Hive, AWS Glue, and Trino.
- Handle both structured and unstructured data workflows.
- Build real-time and batch processing systems for analytics and ML.
Cloud & Infrastructure Management
- Design cloud-based data solutions with AWS services (S3, Athena, Redshift, EMR, etc.).
- Manage the Hive Metastore for schema and metadata management.
- Implement federated querying with Trino.
Performance Optimization
- Tune Apache Spark, Hive, and Trino workloads.
- Use indexing, caching, partitioning, and bucketing to accelerate queries.
- Monitor performance and implement continuous improvements.
Collaboration & Stakeholder Engagement
- Partner with data scientists, analysts, and business teams.
- Align data infrastructure with business and compliance needs.
Data Governance & Quality
- Define and enforce data quality and governance standards.
- Ensure compliance with security and privacy regulations.
Innovation & Continuous Learning
- Stay updated on emerging technologies and trends.
- Continuously improve architecture and processes.
