Data Engineer

Weekday (YC W21)

Location: Bengaluru, Karnataka, India
Job type: Full-time

Required skills

Python
Airflow
AWS
caching
cloud infrastructure
communication skills
cross-functional
data ingestion
database
Docker
frontend
GCP
Lambda
Linux
Pandas
PostgreSQL
proxy
Redis
SQL
statistics
Websockets

About the role

Website: weekday.works
Job details:
This role is for one of Weekday's clients.

Salary range: Rs 25,00,000 - Rs 30,00,000 (i.e., INR 25-30 LPA)

Min Experience: 3 years

Location: Bangalore

Job type: Full-time

Requirements

Key Responsibilities

Own the data engineering function end to end, from data ingestion through to the serving layer, operating as a highly autonomous individual contributor
Develop both real-time and batch ingestion pipelines for prediction market APIs (Polymarket Gamma/CLOB, Kalshi), sportsbook odds feeds (Pinnacle), and statistical sources (HLTV, ESPN, Flashscore)
Architect and execute a Medallion architecture (Bronze → Silver → Gold) for market data, on-chain orderbook snapshots, historical odds, match results, and player/team statistics
Create the feature store that powers our AI edge models, including Elo ratings, Bradley-Terry map-veto probabilities, Bayesian calibration indicators, and Kelly sizing inputs
Implement WebSocket listeners and streaming infrastructure to track live odds fluctuations, in-play probability updates, and orderbook depth changes, targeting sub-second latency
Construct nightly batch pipelines for the data needed in model retraining, including historical odds versus outcomes, walk-forward backtesting datasets, and profit and loss reconciliation across exchanges
Establish cloud infrastructure (AWS/GCP), manage job orchestration (Airflow/Prefect/cron), and deploy monitoring, alerting, and data quality checks throughout all pipelines
Develop data APIs and caching layers that provide the trading terminal frontend with standardized, low-latency market data across all supported exchanges
Conduct research and development on new data sources, scraping techniques, and tools to continuously broaden market coverage and enhance data freshness

Qualifications

Minimum of 3 years of practical Data Engineering experience building scalable production pipelines
Proficiency in Python (including Pandas, asyncio, aiohttp, requests, BeautifulSoup/Scrapy) and advanced SQL
Experience with PostgreSQL, Redis/DragonflyDB, and cloud platforms such as AWS (S3, Lambda, RDS) or their GCP equivalents
Hands-on expertise with WebSockets, streaming data, and real-time event-driven system architectures
Skilled in REST API integration, webhook configuration, and large-scale web scraping (handling rate limits, proxy rotation, anti-bot measures)
Familiarity with workflow orchestration tools (Airflow, Prefect, or Dagster) and CI/CD pipelines on Linux/Docker environments
Strong foundation in database design, Medallion/lakehouse architectures, and data modelling for analytical purposes
Clear, well-structured communication skills in English
Ability to independently manage execution, proactively troubleshoot, and resolve issues without supervision
A meticulous data quality mindset, with an instinct for keeping data accurate and reliable
Resourceful problem solver, capable of devising quick workarounds and fixes for API failures or changes in rate limits
Habit of maintaining thorough documentation for schema definitions, pipeline DAGs, and failure runbooks
Comfortable with rapid iteration cycles and designing extensible pipeline architectures
Strong cross-functional communication skills to clearly convey technical information to non-technical stakeholders

Skills

Python, SQL, PostgreSQL, Redis, AWS, GCP, API Integration, Web Scraping, Airflow, Prefect, Dagster, WebSockets, Streaming Data, Docker, Data Modelling