PE
Job Description
What You'll Do:
- Design and Develop: Analytics workloads using Apache Spark and Scala for big data processing.
- Create and Optimize: Data transformation pipelines using Spark or Apache Flink.
- Migrate Workloads: From cloud platforms to open-source Apache Spark infrastructure on Kubernetes.
- Optimize Performance: Implement performance optimization techniques for large-scale data processing.
Expertise You'll Bring:
- Scala Programming: Focus on the functional programming paradigm.
- Apache Spark: Extensive experience with core concepts and APIs, including:
  - Spark SQL and DataFrame APIs
  - Spark Structured Streaming
  - Spark MLlib for analytics
- Distributed Computing: Strong understanding of big data processing frameworks.
- Data Modeling: Expertise in data modeling and optimization techniques for large-scale datasets.
- Performance Tuning: Proficiency in optimizing Spark jobs.
- Lakehouse Storage: Good understanding of technologies like Delta Lake and Apache Iceberg.
