Deploy Existing YOLO Object Detection Code to AWS Inferentia (Inf1/Inf2) Instance

Min Experience

0 years

Location

India

Job Type

Entry Level

Overview

About the role

We are looking for an experienced ML/DevOps engineer to help us migrate and deploy an existing YOLO-based object detection codebase (currently running locally) to an Inferentia-based Amazon EC2 Inf1 or Inf2 instance for optimized inference performance.

What We Have:

- A working Python codebase that runs inference with a trained YOLO object detection model
- Input is images only
- Inference currently runs on a local GPU/CPU
- Model format is either PyTorch or ONNX (we can adjust if needed)

What We Need:

- Set up an Inf1/Inf2 AWS EC2 instance correctly with all required dependencies
- Modify or optimize the existing inference code (if needed) to work efficiently on the Inferentia chip
- Ensure proper performance (low latency, optimized throughput) and smooth execution
- Help us benchmark performance before vs. after migration
- Documentation and instructions for future deployments

Ideal Skills:

- Experience with AWS Inferentia (Inf1/Inf2) and the AWS Neuron SDK
- Proficiency in YOLO models (v5, v6, v7, or v8) and deep learning inference optimization
- Familiarity with PyTorch, ONNX, and model conversion
- Strong experience with AWS EC2 setup, deployment, and cost optimization
- Ability to troubleshoot model compatibility and inference bottlenecks

Deliverables:

- Fully functioning YOLO inference code running on an AWS Inferentia instance
- Any model conversion/optimization steps (if required)
- Deployment documentation

Bonus (not required but nice to have):

- Experience with containerization (Docker) and deploying inference using FastAPI or Flask

To Apply:

Please share:

- Relevant experience with Inferentia and YOLO
- Any similar deployments or optimizations you've done
- Time estimate for completing this migration

Looking forward to working with a great ML engineer to scale our detection pipeline!
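To make the before-vs-after benchmarking request concrete, here is a minimal sketch of the kind of harness that could be run unchanged on the local machine and on the Inferentia instance. It is only an illustration: the names `benchmark` and `dummy_infer` are hypothetical, and `dummy_infer` is a stand-in that would be replaced by the real YOLO inference call on each platform.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def benchmark(infer: Callable[[object], object],
              images: Sequence[object],
              warmup: int = 5,
              runs: int = 50) -> Dict[str, float]:
    """Time single-image inference calls and summarize latency and throughput."""
    if not images:
        raise ValueError("images must not be empty")
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Warm-up calls are not timed: the first calls often include one-off
    # costs such as model loading or lazy initialization.
    for i in range(warmup):
        infer(images[i % len(images)])

    latencies_ms = []
    for i in range(runs):
        start = time.perf_counter()
        infer(images[i % len(images)])
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    latencies_ms.sort()
    total_s = sum(latencies_ms) / 1000.0
    # Simple nearest-rank style index for the 95th percentile.
    p95_index = min(len(latencies_ms) - 1, int(0.95 * len(latencies_ms)))
    return {
        "mean_ms": statistics.fmean(latencies_ms),
        "p50_ms": statistics.median(latencies_ms),
        "p95_ms": latencies_ms[p95_index],
        "images_per_s": runs / total_s if total_s > 0 else float("inf"),
    }


def dummy_infer(image):
    # Hypothetical stand-in for the real YOLO forward pass.
    time.sleep(0.001)
    return []


if __name__ == "__main__":
    stats = benchmark(dummy_infer, images=[object()] * 4, warmup=2, runs=20)
    for name, value in stats.items():
        print(f"{name}: {value:.2f}")
```

Running the same script with the same images, once against the local model and once against the Neuron-compiled model, gives directly comparable mean, median, and 95th-percentile latency plus images per second. Throughput here is for sequential, batch-size-1 calls; batched or multi-core measurements would need a different loop.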

Skills

Amazon Web Services

Python

TensorFlow

Neural Network

Machine Learning