Website:
bluvinsolutions.com
Job details:
Location: Remote
Shift Timings: General Shift
Notice Period: Immediate joiners only
Experience: 3 - 4 Years
Job Description
Mandatory Skills: Python (3.9+), PySpark & Spark Internals, Databricks, Statistics/ML Libraries (statsmodels, scikit-learn, SciPy, pandas, NumPy), Causal Inference & Experimentation (DID, Synthetic Control, A/B testing, hypothesis testing, panel data methods), API Development, Azure Cloud Platform, Kubernetes, Docker, pytest.
Role Overview:
We're looking for an ML Engineer to join our Test & Learn Platform team. You'll build and scale our experimentation and causal inference services — from statistical engines to API integrations and cloud pipelines — empowering business teams globally to make data-driven decisions.
Responsibilities:
- Develop and maintain statistical/ML modules (DID, Synthetic Control, A/B Testing, Multi-Treatment Effects) in Python
- Build and extend FastAPI services and integrate them with our web application via SDK wrappers
- Design and optimize large-scale data pipelines using PySpark, Delta Lake, and Azure Data Lake
- Profile and resolve OOM issues in PySpark jobs: optimize memory allocation, partitioning, broadcast joins, caching strategies, and Spark configurations
- Deploy and manage workloads on Databricks, including job clusters, notebooks, and Delta Lake tables
- Containerize and deploy services using Docker, Kubernetes, and CI/CD pipelines
- Ensure code quality and security via SonarCloud, Snyk, and pytest
- Collaborate with data scientists and product teams to translate research into production-ready modules
Requirements:
- 3+ years of production experience in Python (3.9+)
- PySpark & Spark Internals - strong experience with Spark memory model, executor tuning, shuffle optimization, and diagnosing/resolving OOM errors (broadcast thresholds, partition skew, spill-to-disk, GC tuning)
- Databricks - hands-on with job orchestration, cluster configuration, notebook workflows, and Delta Lake optimization (Z-ordering, compaction, caching)
- Causal Inference & Experimentation - DID, synthetic control, A/B testing, hypothesis testing, panel data methods
- Statistics/ML Libraries - statsmodels, scikit-learn, scipy, pandas, numpy
- API Development - building RESTful services with FastAPI (or similar)
- Cloud (Azure) - Azure Storage, Azure ML, Data Lake
- Docker & Kubernetes - containerization and orchestration for ML workloads
- Testing - writing robust unit/integration tests with pytest
Good-to-Have:
- Experience with Celery/Redis for async task orchestration
- Familiarity with Polars, PyArrow, or SQLAlchemy
- Background in econometrics or experimental design
- Spark UI profiling and performance benchmarking
- CI/CD tooling (SonarCloud, Snyk, GitHub Actions)