AI Infrastructure Engineer - LLM
Hitya Global
Full-time

Required skills: caching, CDN, end-to-end, load balancing, Redis, state management, TensorFlow, REST APIs

About the role
Hitya Global
Website: hityaglobal.in

Job Description

What You'll Own:

AI Infrastructure Architecture
- Design and implement asynchronous multi-agent orchestration
- Own end-to-end latency from user message to AI response
- Build resilient inference pipelines that gracefully degrade under load
- Implement intelligent request routing and load balancing for AI workloads
- Migrate critical AI conversation flow from monolith to dedicated services
- Implement WebSocket/streaming infrastructure for real-time chat
- Design circuit breakers and fallback strategies for AI model failures
- Build comprehensive observability for AI system performance
- Optimize credit data retrieval and caching strategies

Technical Requirements

Experience
- ONLY LOOKING FOR CANDIDATES FROM TIER 1 INSTITUTES.
- ONLY LOOKING FOR CANDIDATES FROM B2C DOMAIN.
- 3 to 5 years building production systems handling >10k concurrent users
- Proven experience with async/event-driven architectures (not just REST APIs)
- Hands-on experience scaling ML/AI inference in production
- Deep understanding of caching strategies (Redis, in-memory, CDN)
- Experience with message queues and real-time communication protocols

AI-Specific Expertise
- Built systems integrating multiple LLM/AI models in production
- Experience with AI model serving frameworks (TensorFlow Serving, Triton, etc.)
- Understanding of AI inference optimization (batching, caching, model quantization)
- Knowledge of conversation state management and context handling
- Has debugged production issues under high AI inference load

Growth Path
- Direct impact on customer subscription retention through performance
- Exposure to cutting-edge AI infrastructure challenges
- Ownership of technical decisions affecting revenue-generating conversations
- Path to leading the AI platform team as you scale

Interview Process
- Technical Assessment: Intro + AI/ML-focused technical discussion (60 minutes)
- System Design: Architecture and scaling conversations (60 minutes)
- Final Round: Cultural alignment and team interaction

(ref: hirist.tech)