• Bachelor's degree in Science/IT/Computing
• Experience productionizing AI/ML or LLM-powered workflows (e.g., LangGraph, LangChain, CrewAI) with a focus on reliability, reproducibility, and auditability
• Strong understanding of LLMOps fundamentals, including prompt/model/config versioning and traceable run metadata
• Hands-on data engineering/software development experience with Python, PySpark/Spark, NoSQL (MongoDB), SQL (PostgreSQL), Redis, Databricks, Delta Lake, and Azure Cloud
• Ability to build and maintain an LLM evaluation framework (golden datasets, regression tests, scoring/rubrics, quality thresholds, trend tracking) for non-deterministic outputs
• Proven ability to implement observability for LLM pipelines, including structured logging, metrics, dashboards, alerting, latency breakdown, error taxonomy, and token/cost tracking
• Experience designing and integrating tool-calling / agent skills (function calling, tool interfaces, input/output schemas, guardrails, structured outputs) into data pipelines or services
• Experience with API reliability patterns relevant to model calls (rate limiting, retries/backoff, circuit breakers, timeouts, idempotency, replay/backfill)
• Demonstrated curiosity and self-learning ability in the fast-evolving GenAI industry; able to iterate quickly and then harden solutions for production