Entelligence.AI
Website: entelligence.ai
Job details:
Company Description
Entelligence's mission is nothing short of building the intelligence layer for engineering: the platform that takes ownership of code when no human can. We're starting by giving engineering leaders one place to see how their org actually ships, and by ranking #1 against every leading AI reviewer. If you're into high-quality engineering systems and next-level infrastructure at large scale, read on!
We're hiring a Research Engineer (US – India) to join our small team and push the frontier of what AI code review can do. Apply here: https://binary.so/XTqAIAv
Why Join
As a small team, we work in a highly collaborative environment, and you'll have the opportunity to participate in every part of the business, from idea to production.
Impact: Build the foundation and shape engineering practices, team, and company culture.
Excellence: Practice your craft with other ICs in a well-organized, fast-paced environment.
Ownership: Influence the direction of product and strategy; we care about your opinions.
What You'll Do
We're looking for a research engineer who enjoys living between research and production — someone who reads papers on Friday and ships something based on them by Tuesday. You'll own the models, evals, and reasoning systems that make Ellie the #1 AI code reviewer in the world and keep her there.
- Own the adversarial review loop. Ellie generates a finding, then actively tries to disprove it before it reaches the user — that loop is why our false-positive rate is roughly half the field's. You'll improve it: better hypothesis generation, sharper disconfirming-evidence search, smarter stopping criteria, and tighter grounding in the semantic graph. Every percentage point of precision you win ships to every customer.
- Run the eval flywheel. We maintain a public PR-review benchmark and a human-in-the-loop leaderboard, so improvements are measurable, not vibes. You'll grow the eval set, design new metrics (precision, recall, F1, time-to-merge impact, false-positive cost), and run head-to-head comparisons across GPT, Claude, Gemini, DeepSeek, and open-weights models. When a new model drops, you decide if and how it ships.
- Build the semantic understanding layer. Beyond LLMs, our edge is a semantic graph of each customer's codebase — control flow, data flow, call graphs, type info — that grounds every finding in real evidence. You'll improve graph construction across languages, design retrieval strategies that beat naive RAG, and feed the right context into the right model at the right cost.
- Close the post-mortem learning loop. When a customer writes a post-mortem, the same bug should never ship again on their codebase. You'll design the systems that turn incidents into durable detectors: pattern extraction, fine-tuning pipelines, customer-specific rule synthesis, and continual evaluation to ensure none of it regresses.
- Push prompts, fine-tunes, and agents into production responsibly. Prompt engineering, supervised fine-tuning, RLHF/RLAIF, distillation, structured output, tool use, agent orchestration — pick the right tool for the job, ship it behind evals and feature flags, and watch real metrics move. We care about working systems, not method purity.
- Make the economics work. Code review at scale means hard latency and cost budgets. You'll own model routing, caching, partial recomputation on diffs, speculative decoding, and the line between what runs locally in the IDE vs. on our backend.
- Self-direct your research. You're a technical-founder type with strong taste for what's worth working on. You'll set the research agenda alongside the team, publish what's worth publishing (we already open-source our review benchmark), and turn ideas into shipped product.
What We're Looking For
You're a senior IC who has built systems like these before; this is not an area you'd need to ramp up on. We don't require any formal qualifications, but we value learning new skills, especially from one another. We are looking for someone who feels a sense of duty to the users of their work.
- Highly productive while producing quality code. You enjoy pushing out features in a pragmatic and maintainable way. You know when to use duct tape and when to lay a foundation.
- Design sensibility. While you'll co-craft the interface with top designers and frontend engineers, we expect you to have a knack for great UX, so that you can sense when something is off and flag it, or better yet, polish it.
- Attention to detail while staying pragmatic. We strive for few slips in code, good Git hygiene, and clear written communication, all while remaining low-ego and focused on solutions.
- Interested in productivity apps/systems. You might use your calendar to time block, try out new tools, or self-optimize in other ways.
- Good heart. We don't tolerate jerks and are generally just friendly people.
How to Apply
Interested in learning more about what it's like to build Entelligence.ai with us? We'd love to talk!
Apply here: https://binary.so/XTqAIAv
Diversity and inclusion are core to our culture. If you are a member of an underrepresented group in tech, we strongly encourage you to apply.
Is this role not the right fit? If you resonate with our mission and think your profile would be a great fit, send an email to info@entelligence.ai and pass along any information you believe is relevant.