Data Engineer

Capco
Bangalore
8-13 LPA
Posted 11 Apr 2025
FULL TIME
Spark
Data Engineer

Job Description

Role: Data Engineer

Responsibilities

  • Collect, store, process, and analyze large data sets
  • Choose the optimal solutions for these purposes, then implement, maintain, and monitor them
  • Integrate those solutions with the architecture used across the company and help build out core services that power machine learning and analytics systems

Role Requirements

  • Lead and work closely with all teams (including virtual teams based in non-UK locations), creating a strong culture of transparency and collaboration
  • Ability to process and rationalize structured data, message data, and semi-structured/unstructured data, and to integrate multiple large data sources and databases into one system
  • Proficient understanding of distributed computing principles and of the fundamental design principles behind a scalable application
  • Strong knowledge of the Big Data ecosystem and experience with Hortonworks/Cloudera platforms
  • Self-starter with strong technical skills who enjoys the challenge of delivering change within tight deadlines
  • Knowledge of one or more of the following domains (including market data vendors):
      o Party/Client
      o Trade
      o Settlements
      o Payments
      o Instrument and pricing
      o Market and/or Credit Risk
  • Practical expertise in developing applications and using querying tools on top of Hive and Spark (PySpark)
  • Strong Scala skills
  • Experience in Python, particularly the Anaconda environment and Python-based ML model deployment
  • Experience of Continuous Integration/Continuous Deployment (Jenkins/Hudson/Ansible)
  • Experience working in teams using Agile methods (Scrum) and Confluence/JIRA
  • Good communication skills (written and spoken), with the ability to engage with different stakeholders and to synthesise different opinions and priorities
  • Knowledge of at least one Python web framework (preferably Flask, Tornado, and/or Twisted)
  • Basic understanding of front-end technologies, such as JavaScript, HTML5, and CSS3, would be a plus
  • Good understanding of global markets, market macrostructure, and macroeconomics
  • Knowledge of the Elasticsearch stack (ELK)
  • Experience processing and rationalising structured data, message data, and semi-structured/unstructured data, and integrating multiple large data sources and databases into one system
  • Knowledge of distributed computing principles and of the fundamental design principles behind a scalable application
  • Experience using:
      o Hortonworks/Cloudera platforms
      o HDFS
      o Querying tools on top of Hive and Spark (PySpark)
      o Scala
      o Python, particularly the Anaconda environment
      o Git/GitLab as a version control system
  • Good knowledge of SDLC and formal Agile processes, a bias towards TDD, and a willingness to test products as part of the delivery cycle
