Codefeast
Website: codefeast.in
Job details:
Role Overview
We are looking for a SwarmBench Task Engineer specializing in planning and operations to design and build complex, multi-agent benchmark tasks that simulate real-world planning, scheduling, and operational decision-making scenarios. This role focuses on creating constraint-rich problems that evaluate multi-agent reasoning, decomposition, and optimization capabilities in realistic environments.
What does day-to-day life look like?
- Design and develop multi-agent benchmark tasks involving:
  - Planning, scheduling, and resource allocation
  - Operational decision-making (project management, logistics, incident response, capacity planning)
- Create constraint-rich problem statements with multiple interacting variables
- Develop verification scripts to evaluate:
  - Feasibility (all constraints satisfied)
  - Completeness (all requirements addressed)
  - Optimality (efficient solutions)
- Build decomposition strategies:
  - Split tasks across specialized sub-agents (resource-based, constraint-based, conflict resolution, optimization)
- Model real-world operational scenarios with dependencies, timelines, and resource constraints
- Collaborate on improving task quality, coverage, and evaluation rigor
Requirements
- 5+ years of experience in operations, project management, logistics, or supply chain
- Strong ability to formalize constraints, dependencies, and scheduling logic
- Proficiency in Python for building verification and validation scripts
- Strong structured problem-solving and decomposition skills
- Clear and precise technical writing skills
- Experience with AI coding benchmarks (e.g., SWE-bench, Terminal-Bench)
- Hands-on experience with Docker (Dockerfiles, image builds, debugging)
Preferred qualifications
- Experience with optimization techniques (linear programming, constraint satisfaction, scheduling algorithms)
- Background in operations research
- Experience with simulation or modeling tools
- Knowledge of AI planning systems or automated reasoning
- Project management experience or certifications (PMP, Agile, etc.)