Senior Robotics Engineer (5+ yrs)

Awign

full-time

Required skills

Python
C++
computer vision
end-to-end
Fusion
PyTorch

About the role

Website: awign.com

Location: Bengaluru, Karnataka, India

Work Mode: On-site

Industry: Robotics Engineering

Job Description

We’re looking for a Senior Robotics Perception Engineer to build end-to-end spatial perception systems that combine multi-camera vision, IMU data, and learning-based models into a unified 3D understanding of the world.

You’ll work on problems spanning SLAM, 6DoF pose estimation, multi-device sensor fusion, and calibration, while also applying modern computer vision and ML techniques (e.g., monocular depth estimation, action/skill understanding, vision-language models (VLMs)).

This role sits at the intersection of robotics, 3D computer vision, and applied ML.

What You’ll Work On

- Build real-time perception pipelines combining:
  - Multi-camera systems (head-mounted + wrist-mounted cameras)
  - IMU + RGB fusion for accurate camera pose estimation
- Develop and optimize SLAM / visual-inertial odometry (VIO) systems
- Design multi-device sensor fusion to align multiple viewpoints into a single scene
- Implement 3D / 6DoF hand and object pose estimation from RGB / RGB-D inputs
- Implement object detection models
- Work on stereo + multi-view geometry pipelines
- Build robust camera calibration systems:
  - Intrinsics / extrinsics
  - Cross-device calibration
- Integrate or research ML models for:
  - Monocular depth estimation
  - Action / skill labeling
  - VLM systems
- Optimize pipelines for real-time performance and robustness

What We’re Looking For

Core Skills (Must-Have)

- Strong foundation in 3D Computer Vision & Geometry
  - Multi-view geometry, epipolar geometry, transformations
- Experience with SLAM / VIO / sensor fusion
  - Visual SLAM, visual-inertial fusion, state estimation
- Hands-on experience with camera calibration
  - Intrinsics, extrinsics, stereo calibration
- Experience working with multi-camera systems
- Strong programming skills in C++ and/or Python

Good to Have (High Impact)

- Experience with hand pose / human pose estimation (2D/3D/6DoF)
- Familiarity with RGB-D / depth sensors
- Experience with learning-based vision models
  - Monocular depth
  - Pose estimation
  - Action recognition
- Exposure to VLM or embodied AI systems
- Experience optimizing for real-time systems (latency, memory, throughput)
- Familiarity with frameworks like:
  - OpenCV, PyTorch, ROS, COLMAP, ORB-SLAM, OpenVINS, etc.

Nice to Have (Bonus)

- Experience with multi-device synchronization (time alignment, sensor clocks)
- Background in robotics, AR/VR, or embodied AI systems
- Experience deploying models on edge devices / mobile systems

What Makes This Role Unique

- You’ll work on complex multi-sensor setups (not just single-camera CV)
- Ownership of end-to-end perception stack (not just modeling or infra)
- Blend of classical geometry + modern ML
- Opportunity to shape next-gen embodied / spatial AI systems

Ideal Candidate Profile

Someone Who

- Can move fluidly between math, systems, and ML
- Is comfortable debugging real-world sensor noise and calibration issues
- Has built or worked on real-time perception systems, not just offline models
