Roles & Responsibilities:

Design, develop, and maintain data solutions for data generation, collection, and processing
Serve as a key team member who assists in the design and development of the data pipeline
Create data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems
Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions
Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs
Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency
Implement data security and privacy measures to protect sensitive data
Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions
Collaborate and communicate effectively with product teams
Identify and resolve complex data-related challenges
Adhere to best practices for coding, testing, and designing reusable code and components
Explore new tools and technologies that help improve ETL platform performance
Participate in sprint planning meetings and provide estimates for technical implementation

What we expect of you

We are all different, yet we all use our unique contributions to serve patients.

Basic Qualifications:

Bachelor's degree and 0 to 3 years of Computer Science, IT or related field experience
Diploma and 4 to 7 years of Computer Science, IT or related field experience

Preferred Qualifications:

Functional Skills:

Must-Have Skills:

Hands-on experience with big data technologies and platforms such as Databricks, Apache Spark (PySpark, Spark SQL), AWS, Redshift, and Snowflake, including workflow orchestration and performance tuning of big data processing
Proficiency in data analysis tools (e.g., SQL) and experience with data visualization tools
Proficiency in SQL for extracting, transforming, and analyzing complex datasets from relational data stores
Experience with ETL tools such as Apache Spark and with Python packages for data processing and machine learning model development

Good-to-Have Skills:

Experience with data modeling and performance tuning on relational and graph databases (e.g., MarkLogic, AllegroGraph, Stardog, RDF triplestores)
Understanding of data modeling, data warehousing, and data integration concepts
Knowledge of Python/R, Databricks, SageMaker, and cloud data platforms
Experience with software engineering best practices, including but not limited to version control, infrastructure-as-code, CI/CD, and automated testing

Professional Certifications:

Associate Data Engineer