Data Scientist

Vayuz Technologies
Bangalore
3-15 LPA
Posted 25 Jul 2025
FULL TIME
Tags: Spark, PySpark, Dataproc, Cloud, Data Scientist

Job Description

Responsibilities

  • Research, prototype, and implement recommendation models: two-tower, multi-tower, cross-encoder architectures
  • Utilize text/image embeddings (CLIP, ViT, BERT) for content-based retrieval and matching
  • Conduct semantic similarity analysis and deploy vector-based retrieval systems (FAISS, Qdrant, ScaNN)
  • Perform large-scale data prep and feature engineering with Spark/PySpark and Dataproc
  • Build ML pipelines using Vertex AI, Kubeflow, and orchestration on GKE
  • Evaluate models using recommender metrics (nDCG, Recall@K, HitRate, MAP) and offline evaluation frameworks
  • Drive model performance through A/B testing and real-time serving via Cloud Run or Vertex AI
  • Address cold-start challenges with metadata and multi-modal input
  • Collaborate with engineering for CI/CD, monitoring, and embedding lifecycle management
  • Stay current with trends in LLM-powered ranking, hybrid retrieval, and personalization

Required Skills

  • Python proficiency with pandas, polars, numpy, scikit-learn, TensorFlow, PyTorch, transformers
  • Hands-on experience with deep learning frameworks for recommender systems
  • Solid grounding in embedding retrieval strategies and approximate nearest neighbor search
  • GCP-native workflows: Vertex AI, Dataproc, Dataflow, Pub/Sub, Cloud Functions, Cloud Run
  • Strong foundation in semantic search, user modeling, and personalization techniques
  • Familiarity with MLOps best practices: CI/CD, infrastructure automation, monitoring
  • Experience deploying models in production using containerized environments and Kubernetes

Nice to Have

  • Knowledge of ranking models: DLRM, XGBoost, LightGBM
  • Multi-modal retrieval experience (text + image + tabular features)
  • Exposure to LLM-powered personalization or hybrid recommendation systems
  • Understanding of real-time model updates and streaming ingestion
