CA
Job Description
- PySpark or Scala development and design.
- Experience using scheduling tools such as Airflow.
- Experience with most of the following technologies: Apache Hadoop, PySpark, Apache Spark, YARN, Hive, Python, ETL frameworks, MapReduce, SQL, RESTful services.
- Sound knowledge of working on Unix/Linux platforms.
- Hands-on experience building data pipelines using Hadoop components (Hive, Spark, Spark SQL).
- Experience with industry-standard version control tools (Git, GitHub), automated deployment tools (Ansible and Jenkins), and requirements management in Jira.
- Understanding of big data modelling using relational and non-relational techniques.
- Experience debugging code issues and reporting the identified differences to the development team.
Good-to-Have Requirements
- Experience with Elasticsearch.
- Experience developing Java APIs.
- Experience with data ingestion.
- Understanding of, or experience with, cloud design patterns.
- Exposure to DevOps and Agile project methodologies such as Scrum and Kanban.
