Crudcook
Website: crudcook.com
Job details:
Company Description
Crudcook is a specialist talent solutions partner focused on senior technical hiring for high-growth technology and fintech companies. We work with venture-backed startups, scaling businesses, and category-defining teams to identify, engage, and place senior engineering and AI/ML talent across India and globally. Our approach combines deep technical understanding with a curated, relationship-led search process — we partner with a small, focused set of clients to ensure depth, quality, and meaningful candidate experiences. Our team has supported hiring across some of the most innovative startups and global technology organizations, with a track record of placing senior engineers, machine learning practitioners, and technical leaders into roles where they can do their best work. We believe great hiring is about fit, not filtering — and we're committed to a candidate-first process that respects both the time and the trajectory of the people we represent.
Role Description
This is a remote role for an Applied AI/ML Engineer (Foundation Models & Data), hired on behalf of our client, a well-funded fintech building payments infrastructure at scale. The engineer will own the foundation model layer of the company's stack end-to-end: fine-tuning open-weight large language models on proprietary transaction, partner, and operational data; designing model pipelines that move from raw event data to production inference; and building the data infrastructure that supports the full machine learning lifecycle. The role also involves applying classical machine learning where it is the right tool, including liquidity and volume forecasting, anomaly detection across transaction flows, and partner behavior modeling. The engineer will work closely with the founding team, senior engineers, and cross-functional stakeholders to ship reliable, production-grade systems in a regulated, latency-sensitive domain. This is a first-ML-hire role: the scope is unusually broad, the ownership is real, and the engineer will help build the machine learning function from the ground up. Collaboration, systems thinking, technical leadership, and continuous learning are core aspects of this role.
Qualifications
- 4–6 years of experience building production machine learning systems, with significant hands-on work on transformer-based models
- Demonstrable experience fine-tuning open-weight LLMs (Llama, Qwen, Mistral, Gemma) using techniques such as LoRA, QLoRA, full fine-tuning, DPO, ORPO, or continued pre-training
- Deep understanding of transformer architecture, including attention mechanisms, positional encodings, tokenization tradeoffs, and context length considerations
- Proven track record of shipping at least one fine-tuned LLM to production
- Strong foundation in classical machine learning and forecasting — gradient boosting, time-series methods (Prophet, statsforecast, SARIMA), and statistical reasoning
- Experience designing and optimizing machine learning models across both classical and deep learning paradigms
- Proficiency in Python, with fluency in PyTorch and the Hugging Face ecosystem (transformers, peft, trl, datasets)
- Hands-on experience with at least one inference server such as vLLM, TGI, or SGLang
- Real data engineering capability — SQL fluency, pipeline orchestration, schema design, and familiarity with feature store concepts including point-in-time correctness and online/offline parity
- Comfort with Google Cloud Platform (Vertex AI, GKE, BigQuery, GCS) or equivalent experience on AWS or Azure
- Strong foundation in computer science, algorithms, statistics, and applied mathematics
- Strong analytical, problem-solving, and design-documentation skills
- Bachelor's or Master's degree in Computer Science, Machine Learning, Statistics, or a related field
- Experience with agent frameworks, Model Context Protocol (MCP), tool-use evaluation, or multi-agent orchestration is a plus
- Background in fintech, payments, fraud, or other regulated domains is a plus
- Open-source contributions to machine learning or LLM tooling are a plus
- Distributed training experience (FSDP, DeepSpeed, multi-node) is a plus
- Experience with liquidity, treasury, or financial forecasting in a payments or trading context is a plus
We look forward to your application!
Click Apply to learn more.