Website: recrew.ai
Job details:
Role: Agentic QA & Simulation Engineer
Function: Quality Assurance / AI Engineering
Location: Bangalore
Type: Full-time
Industry: Artificial Intelligence, Commerce, Payments, Logistics, SaaS
About Company
The company is building a foundational agentic AI platform at India scale. It operates across commerce, payments, and logistics for SMBs, SMEs, and MSMEs.
The platform interprets intent through voice and context, autonomously matching demand with fulfilment. It eliminates operational complexity for millions of small businesses entering the digital economy.
Backed by large-scale infrastructure and distribution, this is a rare 0→1 national-scale platform build. The culture is lean, high-ownership, and focused on durable system design over short-term experimentation.
Position Overview
As an Agentic QA & Simulation Engineer, you will build simulation-driven quality systems that stress-test agentic AI across real-world and edge-case scenarios. This is not a traditional QA role — you will design synthetic environments, validation frameworks, and automated pipelines that ensure agent reliability and correctness at national scale. You will own the quality infrastructure underpinning a platform that serves millions of small businesses.
Role & Responsibilities
- Design and build scenario simulators for retail and operational environments to replicate real-world agent interactions at scale
- Create synthetic test environments that model diverse user intents, failure modes, and edge cases across commerce, payments, and logistics flows
- Architect and implement agent validation frameworks in Python to verify correctness, robustness, and behavioral consistency of LangGraph-based agentic workflows
- Build and maintain fully automated testing pipelines that reduce reliance on manual QA processes
- Apply chaos testing and fault-injection methodologies to surface failure scenarios before production deployment
- Integrate LLM evaluation harnesses to assess agent reasoning quality, output accuracy, and decision reliability
- Instrument observability into simulation environments to track agent behavior trends and regression patterns over time
Must Have Criteria
- 1–3 years of experience in automation-heavy QA engineering with ownership of testing infrastructure
- Hands-on experience building testing frameworks from scratch in Python — not just maintaining existing ones
- Proficiency in Python for scripting, test automation, and framework development
- Experience with Playwright for end-to-end and API automation testing
- Demonstrated experience automating complex business or operational workflows end-to-end
- Exposure to distributed systems testing — understanding of failure modes, retries, and eventual consistency
Nice to Have
- Experience with LLM evaluation frameworks or harnesses (e.g., RAGAS, DeepEval, or custom eval pipelines)
- Hands-on experience with LangGraph or similar agentic workflow orchestration tools
- Prior work in chaos engineering or fault-injection testing (e.g., Chaos Monkey, Gremlin, Litmus)
- Experience building simulation or synthetic data environments for AI/ML system validation
- Background in e-commerce or AI-first startups where reliability and correctness are business-critical
What We Offer
- Opportunity to build foundational QA infrastructure for a national-scale agentic AI platform
- High ownership in a lean, 0→1 team with direct impact on platform reliability for millions of SMBs
- Work alongside deep technical builders defining agentic AI architecture from first principles
- Backed by large-scale infrastructure — startup intensity with enterprise-grade distribution and reach
- Competitive compensation with the chance to shape a once-in-a-generation platform build