Digit7
Website: digit7.ai
Job details:
Computer Vision AI Engineer L2
Department: Product Development & AI Research
About the Role
We are looking for a talented Computer Vision AI Engineer with a strong background in deep
learning, object detection, segmentation, and emerging multi-modal AI technologies to join our dynamic team. In this role, you will work on cutting-edge, customer-facing AI products that
combine Computer Vision, Sensor Fusion, and LLM-powered intelligence. You will play a key
role in developing intelligent systems for retail environments, enabling accurate recognition,
understanding, and interaction with store goods and scenes.
This is a hands-on technical position ideal for someone passionate about building
production-grade AI solutions that deliver real business impact.
Key Responsibilities
- Design, develop, train, and optimize Computer Vision models for object detection,
semantic/instance segmentation, and image classification focused on retail store goods
and environments.
- Build and maintain robust data processing and model training pipelines to support large-scale experimentation and production deployment.
- Work on sensor fusion techniques to combine vision data with other sensor inputs for improved accuracy and robustness.
- Research, evaluate, and implement LLM-based technologies for Computer Vision (Multimodal LLMs, Vision-Language Models, LLM agents, etc.) to solve complex retail AI challenges such as zero-shot recognition, contextual understanding, and intelligent reasoning over visual data.
- Explore and develop Agentic AI systems that can autonomously plan, reason, and act on visual inputs to enhance product capabilities.
- Optimize model performance (accuracy, speed, memory footprint) for efficient inference on edge devices or cloud environments.
- Package, containerize, and deploy AI models using Docker and docker-compose in production environments.
- Collaborate closely with product, backend, and frontend teams to integrate AI capabilities into customer-facing applications.
- Continuously monitor model performance in production and implement improvements through retraining, fine-tuning, or architectural changes.
- Stay up-to-date with the latest advancements in Computer Vision, Multimodal AI, LLMs, and Agentic systems, and proactively bring innovative ideas to the team.
Requirements:
Education
Bachelor’s or Master’s degree in Computer Science, with a strong specialization in Artificial
Intelligence, Machine Learning, or Computer Vision.
Experience
Minimum 2 years of hands-on experience in developing and deploying deep learning models,
with a focus on Computer Vision.
Technical Skills:
- Programming: Strong proficiency in Python
- Deep Learning Frameworks: Expertise in PyTorch and/or TensorFlow
- Computer Vision: Solid experience with Object Detection, Semantic & Instance Segmentation, and Image Classification
- Proficiency with OpenCV
- Advanced AI: Strong knowledge of Large Language Models (LLMs) in the context of Computer Vision (Vision-Language Models, Multimodal LLMs, LLaVA, CLIP, etc.)
- Knowledge of Agentic AI frameworks and architectures.
- Data & Modeling: Strong skills in data modeling, dataset curation, augmentation, and annotation pipelines for vision tasks.
- Deployment & Infrastructure: Strong knowledge of Linux operating system
- Hands-on experience with Docker and docker-compose for model packaging and deployment
- Experience with model optimization techniques (quantization, pruning, distillation, ONNX, TensorRT, etc.) is a plus.