Amgen Inc
Senior High Performance Computing Engineer
Hyderabad ₹5-8 LPA Posted 8 May 2025
FULL TIME
Spark
GCP
Azure
AWS
Hadoop
Job Description
Roles & Responsibilities:
- Implement and manage cloud-based infrastructure for HPC environments that support data science (e.g., AI/ML workflows, image analysis).
- Collaborate with data scientists and ML engineers to deploy scalable machine learning models into production.
- Ensure the security, scalability, and reliability of HPC systems in the cloud.
- Optimize cloud resources for cost-effective and efficient use.
- Stay current with the latest cloud services and industry-standard practices.
- Provide technical leadership and guidance in cloud and HPC systems management.
- Develop and maintain CI/CD pipelines for deploying resources to multi-cloud environments.
- Monitor and troubleshoot cluster operations, applications, and cloud environments.
- Document system design and operational procedures.
Basic Qualifications:
- Master's degree in Computer Science, IT, or a related field with 4-6 years of hands-on HPC administration experience, OR
- Bachelor's degree in Computer Science, IT, or a related field with 6-8 years of hands-on HPC administration experience, OR
- Diploma in Computer Science, IT, or a related field with 10-12 years of hands-on HPC administration experience
- Demonstrable experience in cloud computing (preferably AWS) and cloud architecture.
- Experience with containerization technologies (Singularity, Docker) and cloud-based HPC solutions.
- Experience with infrastructure-as-code (IaC) and automation tools such as Terraform, CloudFormation, Packer, and Ansible, with version control in Git.
- Expertise in scripting (Python or Bash) and Linux/Unix system administration (preferably Red Hat or Ubuntu).
- Proficiency with job scheduling and resource management tools (SLURM, PBS, LSF, etc.).
- Knowledge of storage architectures and distributed file systems (Lustre, GPFS, Ceph).
- Understanding of networking architecture and security best practices.
Preferred Qualifications:
- Experience supporting research in healthcare or life sciences.
- Experience with Kubernetes (EKS) and service mesh architectures.
- Knowledge of AWS Lambda and event-driven architectures.
- Exposure to multi-cloud environments (Azure, GCP).
- Familiarity with machine learning frameworks (TensorFlow, PyTorch) and data pipelines.
- Certifications in cloud architecture (AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, etc.).
- Experience in an Agile development environment.
- Prior work with distributed computing and big data technologies (Hadoop, Spark).
- Professional certifications:
- Red Hat Certified Engineer (RHCE) or Linux Professional Institute Certification (LPIC)
- AWS Certified Solutions Architect – Associate or Professional
