CA
Job Description
Role: Data Engineer
Responsibilities
- Collect, store, process, and analyze large data sets
- Choose the optimal solutions for these purposes, then implement, maintain, and monitor them
- Integrate those solutions with the architecture used across the company and help build out core services that power machine learning and analytics systems
Role Requirements
- Lead and work closely with all teams (including virtual teams based in non-UK locations), creating a strong culture of transparency and collaboration
- Ability to process and rationalize structured data, message data, and semi-structured or unstructured data, and to integrate multiple large data sources and databases into one system
- Proficient understanding of distributed computing principles and of the fundamental design principles behind a scalable application
- Strong knowledge of the Big Data ecosystem and experience with Hortonworks/Cloudera platforms
- Self-starter with strong technical skills who enjoys the challenge of delivering change within tight deadlines
- Knowledge of one or more of the following domains (including market data vendors):
  - Party/Client
  - Trade
  - Settlements
  - Payments
  - Instrument and pricing
  - Market and/or Credit Risk
- Practical expertise in developing applications and using querying tools on top of Hive and Spark (PySpark)
- Strong Scala skills
- Experience in Python, particularly the Anaconda environment and Python-based ML model deployment
- Experience of Continuous Integration/Continuous Deployment (Jenkins/Hudson/Ansible)
- Experience working in teams using Agile methods (Scrum) and Confluence/Jira
- Good communication skills (written and spoken), with the ability to engage with different stakeholders and to synthesise different opinions and priorities
- Knowledge of at least one Python web framework (preferably Flask, Tornado, and/or Twisted)
- Basic understanding of front-end technologies, such as JavaScript, HTML5, and CSS3, would be a plus
- Good understanding of global markets, market macrostructure, and macroeconomics
- Knowledge of the Elastic Stack (ELK)
- Experience processing and rationalising structured data, message data, and semi-structured or unstructured data, and integrating multiple large data sources and databases into one system
- Knowledge of distributed computing principles and of the fundamental design principles behind a scalable application
- Experience using:
  - Hortonworks/Cloudera platforms
  - HDFS
  - Querying tools on top of Hive and Spark (PySpark)
  - Scala
  - Python, particularly the Anaconda environment
  - Git/GitLab as a version control system
- Good knowledge of SDLC and formal Agile processes, a bias towards TDD, and a willingness to test products as part of the delivery cycle
