ScryAI - Data Scientist - Computer Vision

Location

Pune, Maharashtra, India

Job Type

Full-time

About the job

This job is sourced from a job board.

Overview

About the role

Job Description

Must Have

Strong hands-on experience with deep learning-based computer vision, including object detection, classification, tracking, and real-time video analytics
Practical experience with CNN-based architectures such as YOLO (v5/v8) or similar, and the ability to train, fine-tune, and evaluate models using PyTorch or TensorFlow
Experience building real-time vision pipelines for live video feeds (CCTV / streaming video) with low-latency constraints
Solid understanding of video analytics concepts including frame sampling, motion analysis, temporal consistency, and object tracking across frames
Strong understanding of image and video preprocessing pipelines including augmentation, normalization, and handling real-world data challenges such as low light, occlusion, motion blur, and varying camera angles
Hands-on experience deploying CV models on edge devices such as NVIDIA Jetson, Raspberry Pi, or similar embedded platforms
Exposure to model optimization techniques for edge deployment including quantization, pruning, or use of lightweight architectures
Ability to design and own end-to-end CV pipelines, from data ingestion and annotation to inference, monitoring, and performance evaluation in production
Experience working with Vision-Language Models (VLMs) or vision-enabled LLMs, and integrating vision model outputs with LLM pipelines for reasoning, event understanding, or summarization
Experience collaborating with backend and DevOps teams for production deployment, including familiarity with Docker and basic MLOps practices
Ability to evaluate and monitor model performance in production using appropriate computer vision metrics

Good To Have

Experience with edge inference formats and runtimes such as ONNX, TensorRT, or OpenVINO
Hands-on experience with video streaming and processing tools and protocols such as OpenCV, RTSP, GStreamer, or similar
Exposure to multimodal AI systems combining vision with text (and optionally audio)
Experience with multi-camera setups, camera calibration, or scene-level analytics
Familiarity with LLM orchestration frameworks such as LangChain or LlamaIndex
Understanding of edge AI security, privacy, and data compliance considerations in surveillance or industrial environments
Experience working on real-world CV deployments in domains such as smart cities, retail analytics, industrial monitoring, safety systems, or large-scale surveillance

(ref:hirist.tech)

Skills

deep learning

computer vision

object detection

classification

tracking

video analytics

cnn

yolo

pytorch

tensorflow

video preprocessing

augmentation

normalization

edge deployment

quantization

pruning

docker

mlops

onnx

tensorrt

openvino

opencv

rtsp

gstreamer

langchain

llamaindex