Data Scientist 4

Oracle
Bangalore
3-12 LPA
Posted 24 Oct 2025
FULL TIME
PyTorch
Python

Job Description

  • Lead the design, development, and scaling of large, high-quality datasets to advance generative AI models in multimodal domains (e.g., text, vision, speech).
  • Define data standards and best practices for acquisition, cleaning, augmentation, annotation, and evaluation to ensure fairness, diversity, and representativeness.
  • Guide the integration of cutting-edge techniques (e.g., fine-tuning, RLHF, domain adaptation) into data generation and model alignment pipelines.
  • Provide technical leadership in building scalable, reliable data pipelines and synthetic data platforms for production environments.
  • Evaluate and operationalize research innovations, shaping how data preparation and generative AI methods transition into production-ready solutions.
  • Partner with research, engineering, and product leaders to define long-term data strategy and accelerate the adoption of generative AI solutions at scale.
  • Mentor and provide thought leadership to scientists and engineers, fostering a culture of data excellence and innovation.

Qualifications and Skills:

  • Bachelor's or Master's degree in Computer Science, Data Science, AI/ML, or a related field, with 6+ years of industry experience.
  • Proficiency in Python and a solid foundation in applied ML methods.
  • Proficiency with PyTorch, Torchvision, OpenCV, and similar libraries, as well as experience building and deploying DNN models in production.
  • Experience building large-scale data pipelines for acquisition, cleaning, augmentation, and validation.
  • Ability to evaluate datasets for distribution, diversity, anomalies, and fairness to assess overall quality and suitability for generative AI.
  • Experience with Computer Vision, NLP, Transformers, Large Language Models, Generative AI, and optimizations for LLM training and serving. Experience with multimodal models is a bonus.
  • Familiarity with advanced techniques (e.g., RLHF, domain adaptation, data augmentation) and their application in generative AI workflows.
  • Proven track record of delivering scalable, data-centric ML solutions.
  • Excellent communication and leadership skills, with experience mentoring junior scientists/engineers and presenting technical strategies to senior stakeholders.
