oneinfer.ai
Location: Greater Bengaluru Area
Job Type: Full-time
About the job
Bengaluru / Remote · 0.5–2 Years Experience · Internship → Full-time
About OneInfer.ai
We build AI inference infrastructure: a unified API layer across LLMs, vision, and video models, with smart routing, serverless GPU compute, and custom kernel optimisation (Triton/CUDA). Our goal: let teams focus on building products, not managing infrastructure.
Ownership
Not a title. Not a perk. It’s how we work.
You take responsibility for your work: you spot problems, fix them, and see things through without being asked. You treat what you build as if your name is on it. Because it is.
What You’ll Do
• Integrate LLMs, vision, and multi-modal models via OpenAI-compatible APIs (see the sketch after this list)
• Build and optimise AI inference pipelines and backend services
• Contribute to developer-facing APIs, SDKs, and observability tooling
• Experiment with prompt engineering, agent workflows, and model routing
• Assist with GPU inference optimisation and benchmarking
• Participate in architecture decisions alongside founders and core engineers
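To give a concrete feel for the first bullet, here is a minimal, illustrative sketch of calling an OpenAI-compatible chat endpoint with the official openai Python client. The base URL, API key, and model name are placeholder assumptions for illustration, not documented OneInfer values.

    # Minimal sketch: calling an OpenAI-compatible chat completions endpoint.
    # The base_url, api_key, and model name are illustrative placeholders,
    # not documented OneInfer values.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.oneinfer.example/v1",  # hypothetical endpoint
        api_key="YOUR_API_KEY",
    )

    response = client.chat.completions.create(
        model="example-llm",  # placeholder model name
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )

    print(response.choices[0].message.content)

Because the API surface is OpenAI-compatible, existing client code typically only needs the base URL and model name swapped to target a different provider.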
What We’re Looking For
Must Have
• Strong Python, Node.js, or TypeScript skills
• Solid understanding of REST APIs and backend development
• Familiarity with LLMs (OpenAI, Claude, or similar)
• Good debugging and problem-solving instincts
Good to Have
• Exposure to GPU programming, CUDA, or Triton
What You’ll Gain
• Hands-on experience shipping production AI systems
• Exposure to multi-model orchestration and real-world scaling
• Direct mentorship from founders
• A clear path to a full-time role
How to Apply
Send us your resume, GitHub profile, and a note on any AI or backend projects you’ve worked on. DM us or email your CV to hireme@oneinfer.ai.
We look forward to building with you.