AWS + PySpark

Chennai ₹2-3 LPA Posted 21 Nov 2025

FULL TIME

Glue

PySpark

Amazon Redshift

DynamoDB

Cloud Services

Data Pipeline Development: Design, implement, and maintain scalable data pipelines using AWS services such as S3, Glue, Lambda, EMR, and Kinesis.
Data Processing and Transformation: Utilize PySpark, Spark, and SQL to perform complex data transformations and aggregations on large datasets within the AWS ecosystem.
Data Storage and Management: Design and implement data storage solutions using Amazon S3, RDS, DynamoDB, and Redshift, ensuring data quality, integrity, and accessibility.
Data Modeling and Warehousing: Develop and maintain data models to support analytics and reporting, leveraging Redshift as the data warehousing solution.
Infrastructure and Cloud Technologies: Provision and manage scalable data infrastructure using EC2, VPC, IAM, CloudFormation, and other relevant AWS services.
Performance Optimization and Monitoring: Continuously monitor and optimize data pipelines and systems using CloudWatch, identifying and resolving performance bottlenecks.
Collaboration and Knowledge Sharing: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and provide technical guidance.

Qualifications:

Proficiency in Python, SQL, PySpark, and Spark.
Strong expertise in AWS data services (S3, Glue, Lambda, EMR, Redshift, DynamoDB, RDS).
Experience with data warehousing, ETL processes, and data modeling.
Excellent problem-solving, analytical, and communication skills.