Cognizant
Website: cognizant.com
Job details:
Job Summary
We are seeking experienced, technically strong AI Agentic Solutions Architects to join our advanced technology group. This role is dedicated to the hands-on design, development, and deployment of sophisticated Agentic AI solutions. You will be responsible for executing complex Proofs of Concept (PoCs) that demonstrate cutting-edge AI capabilities for our key customers. The role requires building and optimizing AI agents on high-performance Nvidia-based GPU infrastructure (including A100 and H100 systems) and leveraging Cognizant's proprietary ATK platform. The ideal candidate is a hands-on architect with a strong background in MLOps and extensive experience with modern agentic frameworks, advanced data semantics, and the Nvidia AI ecosystem.
Key Responsibilities
- Architect, build, and optimize multi-agent AI solutions using frameworks like LangChain, LlamaIndex, and AutoGen.
- Design and implement complex data ingestion and processing pipelines, ensuring robust data semantics for Retrieval-Augmented Generation (RAG) architectures.
- Develop and fine-tune Large Language Models (LLMs) and other foundation models using the Nvidia NeMo framework.
- Deploy and manage high-throughput, low-latency model inference services using Nvidia Triton Inference Server.
- Conduct performance profiling and optimization of AI workloads on Nvidia A100 and H100 Tensor Core GPUs.
- Integrate and manage specialized vector databases such as Milvus, Pinecone, and Weaviate for high-dimensional data indexing and search.
- Leverage Cognizant's ATK platform to orchestrate complex agentic workflows and ensure seamless integration with enterprise systems.
- Collaborate with infrastructure teams to ensure optimal configuration of Kubernetes clusters for GPU-accelerated workloads.
Required Skills and Experience
- Agentic Frameworks: At least 2 years of hands-on experience with LangChain and/or LlamaIndex. Demonstrable ability to build complex chains, tools, and autonomous agents.
- Vector Databases: Proven expertise in deploying and managing at least one of the following: Milvus, Pinecone, Weaviate, or ChromaDB. Deep understanding of embedding models, indexing strategies (e.g., HNSW, IVF), and semantic search.
- Nvidia Software Stack:
  - Nvidia Triton Inference Server: Hands-on experience deploying and scaling models for production.
  - Nvidia NeMo: Experience in training or fine-tuning models.
  - CUDA/cuDNN: Strong understanding of the CUDA programming model and library ecosystem for GPU acceleration.
- Nvidia Hardware Stack: Demonstrable experience working with and optimizing for Nvidia A100 or H100 GPUs. Familiarity with concepts like Multi-Instance GPU (MIG) is highly desirable.
- ML Frameworks & MLOps: Expert-level proficiency in Python and PyTorch. Strong experience with MLOps principles and tools, including containerization with Docker and orchestration with Kubernetes.
- Problem-Solving: Advanced analytical and debugging skills for complex, distributed AI systems.
Preferred Qualifications
- Experience with advanced RAG techniques, including re-ranking and query transformation.
- Familiarity with Nvidia's full enterprise software suite.
- Experience in a customer-facing role, delivering technically complex PoCs or solutions.
- Active contributions to open-source projects in the AI/ML ecosystem.