Position: NLP Engineer
Location: On-site
Type: Full Time
About the Role:
We are seeking a driven NLP Engineer who can help scale, optimize, and deploy large language model (LLM)-based solutions within the healthcare domain. The primary focus of this role is building and maintaining production-grade, end-to-end NLP systems, including backend architecture design, inference optimization, and efficient model deployment pipelines. While there will be opportunities to train or fine-tune LLMs for specific use cases, your core responsibility is to ensure that these models run efficiently and reliably at scale in production environments. In addition to working with cutting-edge LLMs, you will build and maintain NLP pipelines that use already-trained LLMs and embedding models. This includes constructing retrieval-augmented generation (RAG) systems and agentic systems that integrate multiple models and data sources to deliver robust, real-time NLP functionality.
What We Expect You to Bring (These are essentials!):
- Bachelor's or Master's degree in Computer Science or related field.
- 2 years of professional experience (or 1+ year with an advanced degree) in building and deploying ML/NLP systems using Python.
- Proficiency with NLP frameworks (e.g., spaCy, HuggingFace Transformers, LangChain), deep learning libraries (e.g., PyTorch), and common data preprocessing techniques.
- Practical experience in designing, implementing, and maintaining robust, scalable backend infrastructures for NLP and LLM-based applications.
- Strong knowledge of containerization and version control for building reliable, production-grade systems.
- Experience with large datasets: data cleaning, preprocessing, and structuring.
- Hands-on experience optimizing LLM inference performance using frameworks such as vLLM, TensorRT, or Ray.
- Experience deploying NLP models in production environments, including load balancing and latency reduction.
We Definitely Want You If You Have:
- Familiarity with building retrieval-augmented generation (RAG) pipelines and integrating embedding models into NLP workflows.
- Exposure to agentic systems that combine multiple models or tools for more dynamic, context-aware NLP solutions.
- Understanding of prompt engineering, model fine-tuning, and large-scale inference optimization for LLMs.
What You Will Be Doing:
- Production-Grade NLP Systems:
  - Design and implement scalable, efficient NLP pipelines leveraging already-trained LLMs and embedding models.
  - Integrate RAG and agentic components to enhance the capabilities and adaptability of NLP systems.
- Inference Optimization & Deployment:
  - Optimize model inference performance, reduce latency, and improve throughput using techniques and frameworks designed for large-scale LLM deployments.
  - Implement best practices for containerization, CI/CD, monitoring, and observability to ensure rapid, reliable deployments.
- Occasional Model Adaptation:
  - As needed, assist with fine-tuning or adapting LLMs to specific healthcare use cases, while maintaining a focus on long-term scalability and performance.
- Collaboration & Continuous Improvement:
  - Work closely with cross-functional teams, including NLP researchers, backend engineers, product managers, and front-end developers, to deliver high-quality NLP solutions.
  - Participate in code reviews, contribute to architectural discussions, and remain current on emerging NLP and LLM optimization techniques.
Why Join?
- We are revolutionizing a unique industry with the potential to benefit patients all over the world - you can create impact at scale.
- We have access to the best computing resources available, including NVIDIA H100 and A100 GPUs, among others.
- We have had company-sponsored workations in Bali, Sri Lanka, and Manali and take pride in our hard-working yet super fun culture.
- We are working on a few of the most challenging problems in a highly regulated industry, which gives you the opportunity to tackle some of the most interesting problems in the field.
- You will get the chance to work with experts from multiple industries, earn best-in-industry compensation, and continue building your own (and, of course, new) projects.