Persistent
Data Engineer (Java + Spark)
Pune ₹4-9 LPA Posted 16 Apr 2025
FULL TIME
DevOps
CI/CD
Cloud Platform
Git Bash
Job Description
What You'll Do:
- Produce technical designs and implement client requirements.
- Build data pipelines.
- Perform transformations using Java and Spark in Databricks.
- Create automated workflows using triggers and scheduled jobs in Airflow.
- Design and develop Airflow DAGs to orchestrate data processing jobs.
- Design and develop email alerts that give timely notification of job failures or critical system issues.
- Develop and implement code for the data logic.
- Interact directly with customers to understand and gather requirements.
- Provide technical input to customers for analysis and design work.
- Respond to customer queries.
Expertise You'll Bring:
- Proficiency in Java, with a good understanding of its ecosystem. Must have.
- Experience in data engineering with Java and Spark; Java is a must. Good knowledge of Spark architecture.
- Basic knowledge of Linux and Linux scripting is a must.
- Transforming and aggregating data from multiple sources. Good knowledge of Spark architecture, including Spark Core, Spark SQL, RDDs, Datasets, and DataFrames. Basic knowledge is good to have.
- Performance tuning using optimization techniques such as caching data in memory and broadcast. Good to have.
- Azure/cloud DevOps concepts and CI/CD pipelines. Good to have.
- Good knowledge of the architecture of a Spark application. Good to have.
- Knowledge of deployment pipeline concepts. Good to have.
- Hands-on experience with Git Bash. Must have.
- Experience working with the Scrum methodology; familiar with its ceremonies and ways of working. Good to have.
- Able to communicate with end clients by phone and by email, with good email etiquette.
- Understanding business needs, identifying the right data sources, and developing scalable, reliable data pipelines to ensure a smooth process.
