Azure Data Engineer

Ghrs Training
Pune
4-9 LPA
Posted 7 Jul 2025
FULL TIME
Apache Spark
Azure Databricks
PySpark
Azure Synapse Analytics

Job Description

 Responsibilities:

  • Data Pipeline Design & Development: Design, build, and optimize robust and scalable ETL/ELT data pipelines using Azure Data Factory, Azure Synapse Analytics, Azure Databricks (Spark), and other Azure data services. Automate data ingestion, transformation, and loading processes from various sources (e.g., databases, APIs, flat files, streaming data).
  • Data Storage & Management: Select, design, and implement appropriate data storage solutions on Azure, including Azure Data Lake Storage (Gen2), Azure SQL Database, Azure Synapse Analytics (dedicated SQL pools), Azure Cosmos DB, and Azure Blob Storage, optimized for cost, performance, and scalability.
  • Data Modeling & Architecture: Design and implement data models (relational, dimensional, NoSQL) and data architectures that support analytical workloads, data warehousing, and reporting requirements.
  • Data Transformation & Processing: Develop complex data transformation logic using SQL, Python, PySpark, or Scala within Azure Databricks or Azure Synapse Analytics to cleanse, enrich, and prepare data for consumption by data analysts, data scientists, and business intelligence tools.
  • Performance Optimization & Monitoring: Monitor data pipeline performance, troubleshoot issues, and optimize data processing jobs for efficiency, cost-effectiveness, and reliability. Implement logging, monitoring, and alerting using Azure Monitor or similar tools.
  • Data Quality & Governance: Implement data validation, cleansing, and quality control procedures to ensure data accuracy, integrity, and reliability. Collaborate on data governance initiatives, including metadata management and data cataloging (e.g., Microsoft Purview, formerly Azure Purview).
  • Security & Compliance: Implement data security measures, including access controls (RBAC), encryption, data masking, and compliance with data privacy regulations (e.g., GDPR, HIPAA, local Indian regulations).
  • Collaboration & Communication: Work closely with data scientists, data analysts, business intelligence developers, and other stakeholders to understand data requirements and deliver effective data solutions. Translate complex technical concepts into clear, concise communication for diverse audiences.
  • Automation & DevOps: Implement CI/CD pipelines for data solutions using Azure DevOps to automate deployment and management processes.
  • Continuous Learning: Stay updated with the latest Azure data services, features, and industry best practices.
