Persistent Systems
Website: persistent.com
Job details:
Role Overview
We are looking for a GenAI/LLM Engineer with strong Python engineering skills and proven experience building production-grade Retrieval-Augmented Generation (RAG) systems using LlamaIndex and/or LangChain, and integrating vector databases (Pinecone preferred). The role will focus on designing scalable RAG pipelines, implementing advanced retrieval strategies, and building MCP/tool-calling connectors that expose enterprise APIs as agent tools for read/write operations.
Key Responsibilities
- RAG Pipeline Design & Development
  - Design and develop end-to-end RAG pipelines, including:
    - Data ingestion
    - Document parsing and preprocessing
    - Chunking strategies
    - Embedding generation
    - Indexing into Pinecone (preferred)
    - Retrieval and response generation
  - Build production-ready semantic retrieval solutions and continuously improve relevance/grounding quality.
  - Implement and optimize advanced retrieval strategies, including semantic search and retrieval tuning.
- Agent Tooling & MCP Integrations
  - Build and integrate MCP connectors to expose internal/external system APIs as agent-callable tools (read/write).
  - Contribute to agent orchestration patterns, including:
    - Intent routing (e.g., deciding between RAG vs MCP vs workflow)
    - Tool selection and execution sequencing
    - Agent reliability patterns (fallbacks, retries, observability)
- Security, Reliability & Performance
  - Apply security controls and handle authentication/authorization tokens, ensuring safe access to enterprise systems.
  - Optimize AI/ML workflows for performance, scalability, and reliability (latency, throughput, cost, robustness).
  - Ensure seamless deployment and integration across environments in collaboration with platform/DevOps teams.
- Cross-functional Collaboration
  - Work closely with product, backend, data engineering, and platform teams to ensure successful integration and delivery.
  - Contribute to design discussions, technical documentation, and best practices for GenAI application engineering.
Required Skills & Qualifications
- 5–8 years of experience in software engineering and/or data engineering.
- 2+ years of hands-on experience building LLM/GenAI applications.
- Strong programming expertise in Python.
- Proven production experience with LlamaIndex and/or LangChain, especially for RAG systems.
- Hands-on experience with vector databases; Pinecone preferred.
- Strong understanding of retrieval concepts, embeddings, indexing, and semantic search.
Preferred / Good-to-Have
- Knowledge of MCP/tool-calling patterns; FastMCP experience is a strong plus.
- Experience with agent frameworks, tool routing, and workflow orchestration.
- Familiarity with observability for GenAI apps (logging, tracing, evaluation, prompt versioning).
What Success Looks Like (KPIs/Outcomes)
- A high-quality RAG pipeline that delivers accurate, grounded responses with measurable improvements in relevance.
- Reliable MCP connectors enabling safe tool-based automation across enterprise systems.
- Reduced latency and improved scalability with robust security and token management.
- Smooth integration and deployment through strong collaboration and engineering discipline.
GenAI, Azure AI, OpenAI, Python, LLM