Solutions Architect AI

Salary

₹20 - 40 LPA

Min Experience

1 year

Location

New Delhi, Delhi, India

Job Type

full-time

About the job

This job is sourced from a job board.

About the role

NVIDIA Partner Company (Micropoint Computers Pvt. Ltd.) is seeking a Generative AI Solution Architect with expertise in training Large Language Models (LLMs) and in implementing pretraining, fine-tuning, and Retrieval-Augmented Generation (RAG) workflows. As part of our AI Solutions team, you'll design and deliver solutions built on NVIDIA's generative AI technologies. This role requires proficiency with open-source LLMs and RAG workflows.

What You Will Be Doing
Architect generative AI solutions with a focus on LLM training, deployment, and RAG workflows.

Collaborate with customers to design tailored solutions and understand their language-related business challenges.

Support pre-sales activities with technical presentations and demonstrations of LLM and RAG capabilities.

Provide feedback to NVIDIA engineering teams and contribute to the evolution of generative AI software.

Engage with customers/partners to understand their requirements.

Lead workshops and design sessions for generative AI solutions focused on LLMs and RAG workflows.

Implement strategies for efficient LLM training and optimize models using NVIDIA's platforms.

Design RAG-based workflows to enhance content generation and information retrieval.

Integrate RAG workflows into customer applications and stay updated on language models and generative AI technologies.

Provide technical leadership on LLM training and RAG-based solutions.

What We Need To See

  • Bachelor's degree in Computer Science, Artificial Intelligence, or equivalent experience.
  • 1-4 years in a technical AI role, focusing on generative AI and training Large Language Models (LLMs).
  • Proven track record of deploying and optimizing LLMs in production.
  • Understanding of state-of-the-art multimodal models such as Llama-3.1 and Llama-3.2 (Vision).
  • Expertise in training LLMs using frameworks like TensorFlow, PyTorch, or Hugging Face Transformers.
  • Proficiency in model deployment and optimization on various hardware platforms, focusing on GPUs.
  • Excellent communication and collaboration skills for technical and non-technical stakeholders.
  • Experience in leading workshops, training sessions, and presenting technical solutions.
  • Familiarity with containerization (e.g., Docker) and orchestration tools (e.g., Kubernetes) for model deployment.

Ways To Stand Out From The Crowd

  • Ability to optimize LLMs for inference speed, memory efficiency, and resource utilization.
  • Deep understanding of GPU cluster architecture and distributed computing concepts.
  • Experience with NVIDIA GPU technologies and GPU cluster management.
  • Ability to design scalable workflows for LLM training and inference on GPU clusters.

About the company

NVIDIA Partner Company (Micropoint Computers Pvt. Ltd)

Skills

tensorflow
pytorch
hugging face transformers
docker
kubernetes