Website:
bytezera.com
Job details:
AI / ML Ops Engineer
Company: Bytezera
Company Description
Bytezera is a data services provider that specialises in AI and data solutions to help businesses maximise their data potential. With expertise in data-driven solution design, machine learning, AI, data engineering, and analytics, we empower organisations to make informed decisions and drive innovation. Our focus is on using data to achieve competitive advantage and transformation.
Experience: 1+ year
Location: Remote / Hybrid
Employment Type: Full-time
Bytezera is an AI and data solutions provider helping businesses unlock the full potential of their data through AI, machine learning, data engineering, analytics, and cloud-native solutions.
We are looking for an AI / ML Ops Engineer with hands-on experience with, or strong project exposure to, Generative AI, Agentic AI, RAG pipelines, small model fine-tuning, FastAPI, AWS deployment, and API security.
Key Responsibilities
- Build and deploy LLM-based AI agents using LangGraph, CrewAI, AutoGen, LangChain, or LlamaIndex.
- Develop FastAPI-based backend services for AI agents, model inference, RAG workflows, and automation APIs.
- Support self-hosted small model fine-tuning and inference using open-source models such as Llama, Mistral, Phi, Gemma, Qwen, T5, or BERT.
- Work with LoRA, QLoRA, PEFT, instruction tuning, quantization, and model evaluation.
- Build RAG pipelines covering ingestion, chunking, embeddings, vector search, reranking, and response generation.
- Containerize and deploy AI services using Docker, AWS ECS, ECR, EC2, S3, Lambda, CloudWatch, and API Gateway.
- Build and maintain GitLab CI/CD pipelines for testing, Docker builds, ECR image pushes, and ECS deployments.
- Support secure API design using JWT, OAuth/OIDC basics, API keys, rate limiting, request validation, CORS, and secure secret handling.
- Monitor AI systems for latency, failures, hallucinations, cost, model quality, and performance issues.
Required Skills
- 1+ year of experience in AI engineering, ML engineering, backend development, data science, or MLOps.
- Strong programming skills in Python.
- Experience or project exposure with FastAPI.
- Understanding of Generative AI, NLP, RAG, LLM applications, or AI agents.
- Exposure to Hugging Face, PEFT, LoRA/QLoRA, or open-source model fine-tuning.
- Working knowledge of Docker, GitLab CI/CD, AWS ECS, ECR, S3, Lambda, EC2, CloudWatch, and API Gateway.
- Basic understanding of API security, authentication, authorization, JWT, OAuth/OIDC, and secrets management.
- Familiarity with vector databases such as ChromaDB, FAISS, Qdrant, Pinecone, Weaviate, or pgvector.
Nice to Have
- Experience with Ollama, vLLM, TGI, or Triton.
- Exposure to MLflow, RAGAS, DeepEval, LangSmith, or Weights & Biases.
- Knowledge of PostgreSQL, pgvector, Redis, Terraform, or AWS Secrets Manager.
- Awareness of prompt injection, hallucination control, guardrails, and AI safety.
Qualifications
Bachelor’s degree in Computer Science, AI/ML, Data Science, Software Engineering, or a related field.
Strong hands-on projects, internships, a GitHub portfolio, or equivalent practical experience will also be considered.