Website:
deccanexperts.ai
Job details:
About Us:
Deccan AI Experts is a pioneering company founded by IIT Bombay and IIM Ahmedabad alumni, with a strong founding team from IITs, NITs, and BITS. We specialize in high-quality human-curated data and AI-first scaled operations services. Based in San Francisco and Hyderabad, we are a young, fast-moving team on a mission to build AI for Good, driving innovation and positive societal impact.
About the Role:
As a RAG & Agentic AI Evaluation Engineer, you will work on cutting-edge Retrieval-Augmented Generation (RAG) systems and Agentic AI workflows.
Your primary tasks are to evaluate agent behavior, annotate model reasoning, validate retrieval quality, and provide detailed feedback that improves AI decision-making and multi-step workflows. This role suits candidates with strong analytical thinking, experience with LLMs, and familiarity with multi-step agent systems or RAG pipelines.
Responsibilities:
- Annotate model responses, reasoning steps, tool usage, and agent actions
- Evaluate RAG output quality: relevance, accuracy, grounding, hallucinations
- Review agent workflows and end-to-end task execution
- Identify incorrect reasoning, missing retrievals, or flawed tool calls
- Provide structured feedback to improve agent behavior and RAG performance
- Validate retrieved documents, sources, and context relevance
- Review multi-hop reasoning chains for correctness and completeness
- Tag error types such as bias, hallucination, logical gaps, and retrieval mismatch
- Follow detailed annotation guidelines to ensure consistent evaluations
- Document insights, issues, and improvement suggestions for AI research teams
- Collaborate with model developers to refine prompts, retrieval logic, and agent strategies
Skills & Experience Required:
- 1–4 years of experience working with RAG systems or other agentic AI systems
- Understanding of LLMs, RAG pipelines, embeddings, and retrieval workflows
- Ability to judge correctness of model outputs and identify subtle issues
- Familiarity with agentic workflows (tool use, multi-step tasks, reasoning traces)
- Experience in documentation, evaluation, or quality review processes
- High attention to detail and ability to follow structured guidelines
- Experience with vector databases, prompt engineering, or LLM tools
- Knowledge of multi-hop reasoning, chain-of-thought, or tool invocation
- Prior experience in annotation or AI evaluation projects
Why join us?
- Competitive hourly pay: up to ₹1,500 per hour.
- Fully remote and flexible work schedule.
- Opportunity to contribute to the advancement of AI technology.
NOTE: Pay varies by project and is typically up to ₹1,500 per hour. Once you clear our screening process, working an average of 3 hours every day could earn you up to ₹90,000 per month.
Click Apply to learn more.