LLM Engineer – LLM & Generative AI

Aerocraft Engineering

full-time

Required skills

LangChain
Python
Azure
end-to-end
GCP
PyTorch

About the role

Website: aerocraftengineering.com

About Our Company:

Aerocraft Engineering India Pvt Ltd, based in Ahmedabad, provides services to a US-based group of Architecture, Engineering, and Construction companies:

Russell and Dawson – An Architecture/Engineering/Construction firm (www.rdaep.com)
United-BIM – BIM Modeling Services Firm (www.united-bim.com)
AORBIS – Procurement as a Service Provider (www.aorbis.com)

We are a nimble and growing organization where everyone’s role is very important for the company’s business success. All team members’ contributions have a direct correlation with the company’s performance in meeting its business and financial objectives.

We are looking for an experienced AI Engineer with hands-on experience building production-grade applications powered by Large Language Models. You will design, develop, and optimize LLM-based systems, with a strong focus on scalable Retrieval-Augmented Generation (RAG) pipelines and model fine-tuning, to deliver intelligent solutions that drive real business impact.

Experience Required: Minimum 3 years

Job Location:

Ahmedabad (Siddhivinayak Towers, Makarba)

Shift Timings:

11 am to 8 pm or 9 am to 6 pm (shift may change as per business requirements)
Monday to Friday
Work from office

Immediate joiners are preferred.

Key Responsibilities

Design, build, and maintain scalable RAG systems for knowledge-intensive applications, including document ingestion, chunking strategies, vector store management, and retrieval optimization.
Fine-tune open-source and proprietary LLMs using parameter-efficient techniques such as LoRA and QLoRA to adapt models for domain-specific use cases.
Develop and deploy end-to-end LLM-powered applications including chatbots, agents, summarization tools, and search systems.
Evaluate model performance using quantitative metrics (BLEU, ROUGE, perplexity) and qualitative benchmarks; iterate on prompt engineering and fine-tuning strategies accordingly.
Optimize inference pipelines for latency, cost, and throughput across cloud and on-premise environments.
Collaborate with data engineers, product managers, and stakeholders to translate business requirements into AI-driven solutions.
Stay current with rapidly evolving LLM research, tools, and frameworks, and advocate for best practices across the team.

Must-Have Requirements

3+ years of experience in AI/ML engineering, with at least 3 years focused on LLMs and generative AI.
Proven experience developing scalable RAG systems (vector databases such as Pinecone, Weaviate, Qdrant, Chroma DB, or FAISS; embedding models; retrieval and re-ranking strategies).
Hands-on experience with model fine-tuning using LoRA, QLoRA, or similar PEFT techniques on frameworks like Hugging Face Transformers, PEFT, or Axolotl.
Strong proficiency in Python and ML frameworks (PyTorch, Transformers, LangChain, LlamaIndex).
Solid understanding of transformer architectures, attention mechanisms, and tokenization.
Experience with cloud platforms (AWS, GCP, or Azure) for model training and deployment.

Nice-to-Have

Experience with multi-agent frameworks (AutoGen, CrewAI, LangGraph).
Familiarity with model serving tools such as vLLM, TGI, or Triton Inference Server.
Knowledge of MLOps practices — experiment tracking (W&B, MLflow), CI/CD for ML, and model monitoring.
Experience with quantization techniques (GPTQ, AWQ, GGUF) for efficient deployment.
Contributions to open-source AI/ML projects.

Benefits:

• Exposure to US Projects/Design/Standards

• Company provides Dinner/Snacks/Tea/Coffee

• 5 Days Working Week

• 15 days of paid leave annually and 8-10 public holidays
