Website:
nasugroup.com
Job details:
Experience - 7-11 years
Location - Bangalore & Chennai
Mode - WFO (work from office)
Job Summary:
We are looking for a skilled GenAI Platform Engineer to join our team. The ideal candidate will have 7-11 years of experience managing LLM-based systems, with expertise in infrastructure management, prompt versioning, fine-tuning, and deployment. The role requires a strong understanding of performance tuning, scalability, and governance for GenAI workloads in cloud environments such as AWS, Azure, and Google Cloud. The engineer will play a pivotal role in optimizing the performance of AI systems and ensuring they scale in production, while building and deploying AI use cases and solutions on these platforms.
Key Responsibilities:
- Manage and oversee the infrastructure for LLM-based systems, ensuring seamless operation and scalability.
- Fine-tune, evaluate, and deploy prompt-based models, leveraging industry-standard tools and platforms.
- Ensure the performance, scalability, and governance of GenAI workloads in cloud environments (AWS, Azure, Google Cloud).
- Build and deploy AI use cases and solutions using the respective platforms and tools.
- Collaborate with cross-functional teams to ensure effective deployment and performance optimization.
- Lead the evaluation and enhancement of LLM-based models through iterative testing and fine-tuning.
- Handle deployment pipelines, including CI/CD for LLMs.
- Contribute to setting up automated processes for model fine-tuning and versioning.
- Work on optimizing cloud-based infrastructure to support the growth of GenAI workloads.
Required Skills:
- Strong experience with cloud AI platforms such as AWS SageMaker, Google Vertex AI, or Azure AI.
- Proficiency in handling LLM systems, including prompt versioning and fine-tuning.
- Hands-on experience with infrastructure management, model deployment, and optimization.
- Strong understanding of cloud architecture, performance, and scalability for GenAI workloads.
- Proficiency in Python, SQL, and Bash scripting.
- Experience with machine learning frameworks such as Hugging Face, TensorFlow, and PyTorch.
- Familiarity with CI/CD pipelines, Docker, Kubernetes, and MLOps workflows.
- Strong analytical skills and ability to troubleshoot complex infrastructure issues.
Nice to Have Skills:
- Familiarity with NLP frameworks and libraries such as Hugging Face, TensorFlow, and PyTorch.
- Experience working with large-scale data processing frameworks such as Apache Spark and Hadoop.
- Knowledge of model explainability and interpretability techniques for LLMs.
- Familiarity with containerization technologies (e.g., Docker, Kubernetes) for model deployment and orchestration.
- Hands-on experience with MLOps pipelines.