Jupiter
Website: jupiter.money
Job details:
About the role
You have personally built and shipped an AI agent in production. Not architected it on a whiteboard. Not owned the roadmap for one. You wrote the code. You designed the prompts and the tool layer. You built the retrieval. You ran the eval harness. You shipped it to real users and you held the pager when it broke.
That is the qualifying condition for this role. Everything else in this document is for people who pass that bar. If you do not pass it, no amount of seniority, brand-name companies, or AI-adjacent engineering leadership makes up for it. We will know within ten minutes of an interview. Save us both the time.
If you do pass it, this is the role you have been waiting for.
The bet
Every fintech in India is shipping an AI feature. We are building the AI company.
Multiple production agents are already live at Jupiter, doing real work every day across the business, internal and customer-facing. They were built solo at the leadership level because the role we are now hiring did not yet exist in Indian fintech. It does now.
The thesis is straightforward and large. Every customer gets an agent advocate. Every employee gets an agent peer. The agents talk to each other through a shared trust and tool layer, composing into answers and actions no single agent could deliver. The competitive moat is not any one agent. It is the topology between them. The company that nails this owns the next decade of fintech. We intend to be that company.
The role
You will own the agent runtime, the agent platform, and the engineering-side agents.
You will not be a manager of managers. You will design how five, then fifteen, then fifty AI agents share retrieval, auth, evals, and the ability to call each other. You will spend roughly half your week writing code and reviewing PRs, and the other half deciding what not to build. You will treat each agent as a production system with SLOs, on-call rotations, deploys, and post-mortems.
You will own the engineering agents that change how Jupiter ships software. You get to build agent infrastructure from the substrate up, at production scale. This is it.
The team you’ll work with
Small. Young. Fast. Default-AI.
Most of the people you will work with on this are in their twenties or early thirties. The defaults are “build it yourself, ship it tonight, fix it tomorrow.” Meetings are short. Documents are shorter.
The right hire treats this as a feature, not a compromise. You will be the most senior engineer in the room and you will not act like it. You will move at the pace they move, and you will earn their respect by writing code alongside them, not by holding architecture reviews.
If you have spent the last few years building scaffolding around yourself, you will struggle here. If the thought of working shoulder to shoulder with a 26-year-old who ships faster than you on half the things you used to be best at makes you nervous, this is not the role. If it sounds like the most fun you could be having, read on.
What makes this different from the same title anywhere else
It’s a fleet, not a feature. Most Director of Engineering (AI) roles today involve shipping LLM features inside one product. You will own a topology. A customer agent grounded in a CS-knowledge agent, powered by a code-knowledge agent that ships its own PRs, all riding on a data-access agent. That shape does not exist anywhere else in Indian fintech yet.
You will build platform, not just product. The shared mesh, auth, retrieval, eval, and observability that let us add a new agent in a week instead of a quarter. This is the highest-leverage engineering surface in the company.
Your work compounds. Every agent you ship makes the next agent cheaper. Every eval signal you industrialize makes every future model upgrade safer. Every PR your code-knowledge agent ships frees up an engineer to think about something only humans should think about.
It is regulated infrastructure, and that is a feature. Two RBI-regulated entities sit underneath the runtime. The moat that comes with doing it right is real.
What you’ll own
The customer-facing agent runtime. Sub-second response budgets on streaming token loops. Multi-model routing with graceful fallback when a provider degrades. Prompt caching that survives schema changes. Structured output enforcement, tool-call validation, and retry policies that do not double-spend tokens. Vector retrieval with hybrid search and reranking, tuned for Indian-language and financial-domain edge cases. All of this on async Python, Postgres with pgvector, Redis, and EKS, with first-class tracing across every LLM call, every tool invocation, and every retrieval hop.
The code-knowledge agent. Indexing of the engineering org. Code-aware chunking that respects symbol boundaries, dependency graphs, and recent commit history. The agentic loop that reads, plans, edits, runs tests, and opens a PR with the context a human reviewer actually needs. You will expand it from one allowlisted repo to a meaningful share of every PR the company ships.
The data-access agent layer. The substrate every other agent calls when it needs to know something true about the business. Safe SQL generation against governed schemas. Result-set reasoning. Query plan awareness. Caching that distinguishes “stale is fine” from “must be fresh.”
The shared agent platform. MCP-based tool exposure that makes adding a new capability a config change, not a deploy. A unified retrieval substrate. Eval pipelines that catch regressions before they ship. Shared auth, secrets, and consent boundaries that hold up under audit. Tracing that lets you debug a bad answer across three agent hops in under a minute. Adding the next agent should take a week, not a quarter.
Evals and safety. Eval infrastructure that goes well beyond LLM-as-judge. Golden sets for every agent skill, automated regression on every prompt change, prompt-injection red-teaming as part of CI, response-quality scoring tied back to user signals. Today this fires on every thumbs-down. You will turn it into the substrate every model upgrade rides on.
What we’re looking for, beyond the qualifying filter
Distributed-systems chops in a streaming, LLM-shaped world. Python, Postgres with pgvector, Redis, Kubernetes, async, tracing. You have debugged why a streaming response stalled at token 47, why a tool call dropped a parameter, why a retrieval result was technically correct and operationally wrong, why a secret did not sync, why a deploy lost an environment variable. This is not a notebook.
Agent-engineering taste. Real opinions on MCPs vs raw tool calls, classifiers vs regex, grounding vs reasoning, prompt caching vs context window stuffing, structured outputs vs free-form, planning loops vs constrained chains, evaluator agents vs deterministic checks, when a smaller faster model is the right answer and when it is a footgun. All formed from things that broke on you in prod.
Platform instinct. You design for the fifth agent when you are building the first. You see the auth pattern as the moat, not the inconvenience. You build the tracing before you build the feature, because you know what it costs not to.
Cost and latency fluency. You know the per-token cost of every model you reach for and the cache hit rate of every prompt you write. You have shipped the optimization that cut a per-query cost 40 percent without a quality regression. You treat inference cost as a first-class metric.
Operator-grade AI fluency. You and the Director of Product should be able to argue about a single line of system prompt at the same level of detail.
Builder mentality. You will write code in week one and you will keep writing code through year one.
Ambition that matches the bet. You see this as the chance to build agent infrastructure that the rest of the industry will copy. You are not here to add an LLM call to an existing service.
Comfort with a small, young, AI-first team. You see “the team is small” as freedom, not as understaffing. You ask “can an agent do this” before you ask “who can we hire to do this.” You are energized, not threatened, by engineers moving faster than you expect.
What you won’t do
Sit through six-hour roadmap meetings. Sign off on what your team builds without reading the diff. Treat AI as separate from the rest of engineering. Build process scaffolding to feel senior. Spend any time defending why AI matters. The bet is made.
What success looks like at six months
All agents share a common platform. Adding the next agent takes a week, not a quarter. The code-knowledge agent ships its own PRs at a measurable rate per week. Production incident rate on the customer-facing agent is below an agreed threshold, proven with telemetry you built. Per-query inference cost on the customer agent is down meaningfully against a day-one baseline. Leadership is no longer on-call for any agent. The team that was small and fast when you joined is still small and fast.
The basics
Experience: 8 to 10 years, no more. We are looking for engineers in the prime building years of their career, not veterans who have stopped touching the keyboard.
Compensation: ₹1Cr to ₹1.5Cr cash, plus an equity package sized to be material at exit. We pay at the top of the Indian fintech market for this role. If you are exceptional, we will go higher. We will be specific about valuation, strike, vesting, and the math in the conversation. We do not believe in “we will figure that part out later.”
Location: Bangalore, in person. This is not a remote role.
Reporting: Reports to company leadership for the first six months by design, transitioning to peer-of-executive as the handoff completes.
How to apply
Send us the agent you built. A repo, a live URL, a video of it doing something hard. The architecture diagram you drew on the back of a napkin and then made real. One paragraph on the worst thing that ever happened to it in prod and what you did. We will read this before we read your resume.