AI/LLM DevOps Engineer

Location

Gurugram, Haryana, India

Job Type

part-time

About the job

This job is sourced from a job board.

About the role

Website: dreampathservices.com
Job details:

Job Description – DevOps Engineer – AI/LLM Platform

Location: Gurgaon, Jaipur, Pune, Bhopal, Bangalore, Hyderabad, Chennai

Job Type: Contract-to-hire


Role Overview

We are looking for a skilled DevOps Engineer with hands-on experience in cloud infrastructure, CI/CD, and observability, along with exposure to AI/LLM-based systems. The ideal candidate will be responsible for building and managing scalable infrastructure while enabling reliable deployment and monitoring of modern applications, including AI-driven platforms.

Key Responsibilities

  • Design, deploy, and manage scalable cloud infrastructure on AWS, Azure, or GCP
  • Develop and maintain Infrastructure as Code (IaC) using Terraform across environments
  • Build and manage CI/CD pipelines using GitHub Actions for automated build, test, and deployment
  • Implement observability solutions using Prometheus and Grafana for monitoring, alerting, and visualization
  • Work with LLM-focused observability tools such as Langfuse for monitoring LLM/AI systems
  • Utilize Harness for deployment orchestration, release management, and feature flagging
  • Automate operational tasks using Python or Bash scripting
  • Monitor system performance, troubleshoot issues, and ensure high availability of production systems
  • Collaborate with engineering teams to streamline deployment processes and improve system reliability
  • Maintain dashboards, alerts, and runbooks for proactive incident management

Required Skills & Qualifications

  • 3–7 years of experience in DevOps / Infrastructure Engineering
  • Hands-on experience with at least one cloud platform: AWS, Azure, or GCP
  • Strong expertise in Terraform (module creation, state management, multi-environment setup)
  • Experience with CI/CD tools, preferably GitHub Actions
  • Hands-on experience with Prometheus and Grafana
  • Experience in scripting (Python or Bash) for automation
  • Understanding of containerization technologies like Docker and Kubernetes
  • Strong understanding of system design, networking, and security best practices

Good to Have

  • Experience with LLM/AI observability tools such as Langfuse
  • Exposure to AI/ML or LLM-based platforms
  • Experience with Harness or similar deployment tools

Skills

Python
AWS
Azure
Bash
CI
cloud infrastructure
containerization
DevOps
Docker
GCP
GitHub
Kubernetes
state management
Terraform