Ishan Technologies
Website: ishantechnologies.com
Job details:
Note: Local candidates from Bangalore only.
Job Summary:
We are looking for a highly skilled GCP DevOps Engineer with hands-on experience in AI/ML workloads and GPU-based infrastructure. The ideal candidate will be responsible for designing, deploying, and optimizing cloud-native solutions on Google Cloud Platform (GCP), with a strong focus on performance, scalability, and cost efficiency. This role requires expertise in DevOps practices, cloud automation, and high-performance computing (HPC), along with the ability to collaborate with cross-functional teams and clients.
Key Responsibilities:
- Design, deploy, and manage scalable AI/ML infrastructure on Google Cloud Platform (GCP).
- Manage and optimize GPU/TPU-based workloads and clusters for high performance and cost efficiency.
- Build, implement, and maintain CI/CD pipelines for application and machine learning workflows.
- Monitor system performance and proactively troubleshoot issues related to GPU utilization, Kubernetes workloads, and network performance.
- Collaborate with data scientists and development teams to productionize machine learning models.
- Ensure high availability, reliability, and scalability of cloud infrastructure.
- Implement and enforce security, governance, and compliance standards within GCP environments.
- Drive cost optimization strategies, including budgeting and resource utilization tracking.
- Maintain and enhance monitoring and logging frameworks for proactive issue detection.
- Prepare and maintain technical documentation, SOPs, and architecture diagrams.
- Provide client-facing support, including solution explanations and issue resolution.
- Continuously evaluate and implement improvements in DevOps processes and cloud infrastructure.
Technical Skills & Expertise:
- Strong hands-on experience with Google Cloud Platform (GCP) and its core services.
- Expertise in:
  - Compute Engine, Google Kubernetes Engine (GKE)
  - Cloud Storage, IAM & Access Management
  - Folder and Project hierarchy design
  - Security Command Center and governance frameworks
- Experience with AI/ML platforms and pipelines, including BigQuery and GCP AI services.
- Proven experience in handling GPU/TPU workloads, including scheduling, optimization, and cost control.
- Strong knowledge of containerization and orchestration tools: Docker, Kubernetes, Helm.
- Experience with monitoring and logging tools: Cloud Monitoring, Cloud Logging, Prometheus, Grafana.
- Proficiency in Linux systems administration and performance tuning.
- Understanding of networking concepts and cloud architecture design.
- Familiarity with infrastructure automation and DevOps best practices.
- Knowledge of high-performance computing (HPC) environments is an added advantage.
Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- 4–5 years of experience in DevOps / Cloud Engineering roles.
- Hands-on experience with Google Cloud Platform (GCP) in production environments.
- Experience working with AI/ML workloads and accelerator-based infrastructure (GPU/TPU).
- Strong understanding of DevOps practices, CI/CD pipelines, and cloud automation.
- Relevant GCP certifications (e.g., Professional Cloud DevOps Engineer) are preferred.