Job Description
- Knowledge and data will be central to this journey of creating a proactive and predictive support experience
- The use of automation, AI, and other modern technology will reduce the time taken to resolve issues or perform tasks
- This role is part of Juniper's strategic future of support team and will involve the design of automated solutions
- The role requires enabling business transformation projects using technology and reviewing data to enable process, system, and tool re-engineering in customer support and services
- In addition, the role supports the enhancement of self-service, automation, and omnichannel strategy by seeking solutions and drivers to achieve seamless customer experiences and increased customer loyalty
- Primary tech skills: AWS, Databricks, Python, PySpark, SQL, web crawling
- Secondary tech skills: Snowflake, MLOps
Responsibilities:
- Develop, maintain, and optimize data pipelines and workflows using Databricks Job workflows and Feature Store to ensure seamless data ingestion and transformation as a scalable data solution
- Architect Lakehouse Solutions: Design and implement Lakehouse solutions on Databricks, leveraging Delta Lake and Feature Store to enhance data storage and processing
- In-depth Databricks Expertise: Demonstrate a deep understanding of Databricks platform features, including Spark SQL, Delta Lake, Feature Store and Databricks notebooks, to optimise data engineering processes
- Data Transformation: Implement advanced data transformations and quality checks within the Lakehouse architecture to ensure data accuracy, completeness, and consistency
- Data Integration: Seamlessly integrate data from diverse sources, aligning with Lakehouse principles for data ingestion and storage, leveraging AWS S3 Storage and possibly Snowflake as a SQL Data Warehouse
- Data Security: Implement and maintain comprehensive data security and access controls within the Lakehouse architecture, utilising AWS IAM policies as security features to safeguard sensitive data
- Performance Optimisation: Architect data pipelines for performance and scalability, making efficient use of Databricks clusters and AWS storage resources through S3 Lifecycle rules
- Data Modelling: Create and implement advanced data models and schemas on Databricks, aligning with Lakehouse principles for analytical and reporting needs
- Documentation: Create detailed documentation for data pipelines, data models, Lakehouse architecture configurations, AWS integrations, and data migration plans
- Troubleshooting: Proactively identify and resolve complex data pipeline and architecture issues, ensuring data integrity and availability, with a focus on Databricks/AWS monitoring and diagnostics
- Performance Monitoring: Employ advanced monitoring techniques and tools to maintain the performance and health of the Lakehouse architecture, taking proactive measures to address potential bottlenecks or issues
- Should also have an understanding of cluster monitoring and sizing
- Data Governance: Ensure strict adherence to data governance and data management best practices within the Lakehouse architecture, utilising AWS-based data governance solutions
Qualifications and Desired Experience:
- 7+ years of data analysis and engineering experience
- Bachelor's degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field
- Experience with big data tools: Hadoop, Spark, Kafka, Spark Kafka Streaming, Python
- Familiarity with the Snowflake environment
- Advanced working knowledge of SQL, including query authoring, experience with relational databases, and working familiarity with a variety of databases
- Hands-on experience in Databricks for data pipeline development and AWS services such as S3, Glue, EMR, and EC2
- Experience building 'big data' data pipelines, architectures and data sets
- In-depth knowledge of modelling and designing database schemas for read and write performance
- Working knowledge of API- or stream-based data extraction processes, such as the Salesforce API and Bulk API, and hands-on experience in web crawling
- Experience performing root cause analysis on all data and processes to answer specific questions and identify opportunities for improvement
- Experience building processes supporting data transformation, data structures, metadata, dependency and workload management
- A successful history of manipulating, processing, and extracting value from large, disconnected datasets
Personal Skills:
- Ability to collaborate cross-functionally in a fast-paced environment and build sound working relationships within all levels of the organization
- Ability to handle sensitive information with keen attention to detail and accuracy
- Passion for data handling ethics
- Ability to solve complex, technical problems with creative solutions while anticipating stakeholder needs and providing assistance to meet or exceed expectations
- Able to demonstrate perseverance and resilience to overcome obstacles when presented with a complex problem
- Assist in combining large data sets and data analysis to create optimization strategies
- Comfortable with ambiguity and the uncertainty of change when assessing stakeholder needs
- Effective time management skills that enable you to work successfully across functions in a dynamic and solution-oriented environment while meeting deadlines
- Self-motivated and innovative; confident when working independently, but an excellent team player with a growth-oriented personality
- Will be required to routinely or customarily troubleshoot items related to applications that require independent judgement, decision-making, and unique approaches
