LLMOps Engineer

Uplers
Bangalore
3-9 LPA
Posted 9 May 2025
FULL TIME
Kubernetes
Problem Solving
Python

Job Description

Responsibilities

  • Deploy and scale LLM inference workloads on Kubernetes (K8s) with 99.9% uptime.
  • Build agentic tools and services for fraud investigations with complex reasoning capabilities.
  • Work with Platform Engineers to set up monitoring and observability (e.g., Prometheus, Grafana) to track model performance and system health.
  • Fine-tune open-source LLMs using TRL or similar libraries.
  • Use Terraform for infrastructure-as-code to support scalable ML deployments.
  • Contribute to the tech blog, especially technical deep dives into the latest research on reasoning.

Requirements

  • Strong programming skills (Python or similar languages) and problem-solving abilities.
  • Hands-on experience with open-source LLM inference and serving frameworks such as vLLM.
  • Deep expertise in Kubernetes (K8s) for orchestrating LLM workloads.
  • Some familiarity with fine-tuning and deploying open-source LLMs using methods such as GRPO and libraries such as TRL.
  • Familiarity with high-availability systems.