Qubrid AI
Website: qubrid.com
Job details:
Read everything carefully. The requirements and screening questions are critical: if they are not answered correctly and satisfactorily, your application will be auto-rejected and your time wasted.
- Work from Home.
- This is a full-time role. If you plan to hold two or more jobs at the same time, or want to work part-time, this role will not work for us. In that case, please do not apply, as your application will be auto-rejected.
- Note: this job requires working late at night, until 4 AM India time, to overlap with US working hours. Do not apply if this schedule doesn't work for you.
- Salary depends on experience and current verifiable compensation (paychecks).
- Junior candidates with 2 years of experience are suitable.
AI Engineer (Hands-On) — Multi-Agent AI Platform
About Qubrid AI
Qubrid AI is building next-generation AI infrastructure focused on inference, GPUs, multi-model orchestration, and scalable AI deployments. Our mission is simple: democratize access to AI infrastructure - from developers spending their first $5 to enterprise-scale AI deployments processing billions of inference requests. We are looking for a deeply technical AI Engineer who can design and build production-grade AI systems end-to-end, not just create architecture diagrams.
This role is for builders. You should be equally comfortable:
- writing production Python code
- optimizing inference pipelines
- working with open-source models
- building multi-agent systems
- designing scalable backend architectures
- deploying AI systems into production
If you are primarily theoretical or management-focused, this role is probably not the right fit.
What You’ll Build
You will help develop a full-stack multi-agent AI SaaS platform including:
- Multi-agent orchestration systems
- AI inference pipelines
- Fine-tuning workflows
- RAG systems
- Tool-calling architectures
- Memory and context management systems
- Model routing and optimization layers
- Backend APIs and distributed systems
- GPU-aware inference infrastructure
- Enterprise-grade scalable deployments
This is a highly hands-on engineering role where design and implementation go together.
Responsibilities
AI Systems & Multi-Agent Software Development
- Design and build production-grade multi-agent AI systems
- Develop orchestration frameworks for autonomous workflows
- Implement agent communication, memory, planning, and tool usage
- Build scalable RAG and retrieval pipelines
- Design long-context and multi-modal workflows
Inference & Model Infrastructure
- Optimize inference pipelines for latency and throughput
- Work with open-source models including Llama, Qwen, Kimi, Mistral, DeepSeek, Gemma, Flux, SDXL, and other frontier/open models
- Implement model serving infrastructure using technologies like:
- vLLM
- TensorRT-LLM
- TGI
- Ollama
- SGLang
- Ray Serve
- Build intelligent model routing and fallback systems
- Improve GPU utilization and inference efficiency
Fine-Tuning & Model Optimization
- Build and manage fine-tuning pipelines
- Work with:
- LoRA / QLoRA
- PEFT
- RLHF/RLAIF concepts
- Quantization
- Distillation
- Evaluate models across latency, quality, and cost tradeoffs
Backend & Platform Engineering
- Develop scalable backend systems using Python
- Design APIs, microservices, async workflows, and distributed systems
- Build production-grade SaaS software
- Implement observability, logging, monitoring, and reliability systems
- Work with vector databases, caching systems, queues, and storage layers
Deployment & Infrastructure
- Deploy AI systems on cloud and GPU infrastructure
- Work with Kubernetes, Docker, and scalable orchestration systems
- Build highly available inference infrastructure
- Optimize infrastructure costs and scalability
Requirements
General requirements
- 2 years of experience in AI engineering
- Strong hands-on Python expertise
- Proven experience building production AI systems
- Experience with LLM inference optimization
- Deep understanding of transformer architectures and modern LLM ecosystems
- Experience with open-source model deployment
- Strong backend engineering experience
- Experience designing scalable SaaS platforms
- Experience with APIs, async systems, and distributed architectures
- Strong debugging and systems-thinking ability
AI/ML Experience
- Multi-agent systems
- RAG architectures
- Fine-tuning pipelines
- Embeddings and vector databases
- Tool-calling frameworks
- Model evaluation and benchmarking
- Prompt orchestration and workflow systems
Infrastructure Experience
- Docker
- Kubernetes
- GPU infrastructure
- CI/CD pipelines
- Cloud platforms (AWS/GCP/Azure)
- Distributed inference systems
What We’re Looking For
We are specifically looking for engineers who:
- build things themselves
- move fast
- can go from idea to production
- understand both AI and systems engineering
- can design and implement
- are comfortable operating in ambiguity
- care about performance and scalability
- are obsessed with execution
You should be able to:
- write production code daily
- review system bottlenecks
- optimize inference performance
- debug distributed systems
- build MVPs rapidly
- scale products into production systems
Bonus Points
- Experience building AI SaaS products from scratch
- Experience with agentic frameworks
- Experience with GPU optimization
- Contributions to open-source AI projects
- Experience with large-scale inference systems
- Startup experience
- Experience working with high-growth engineering teams
If you want to help shape the future of AI infrastructure and build systems that can scale from startup experimentation to enterprise deployments, we’d love to talk.
Click Apply to learn more.