AI Research Scientist (TFMs)

Lexsi Labs

full-time

Required skills

Python
ACL
API
data science
deep learning
PyTorch

About the role

Website: lexsi.ai
Job details:

Lexsi Labs is building the next generation of Tabular Foundation Models (TFMs) to make structured prediction as reusable and scalable as foundation models have made language and vision modeling.

We are already deep in this space with concrete systems and research artifacts:

We built TabTune, a unified library for inference, benchmarking, evaluation (including calibration and fairness), and fine-tuning across multiple leading TFMs through a single API.
We developed frontier architectures like Orion-MSP, introducing hierarchical multi-scale sparse attention and memory mechanisms designed to scale tabular in-context learning to wide, high-dimensional tables.
We are pushing the ecosystem toward Institutional Tabular Foundation Models (ITFMs), arguing for table-native predictors that capture institutional knowledge directly, without routing tables through text, and emphasizing robustness to drift, schema evolution, governance, and auditability.

Our ambition is simple and disruptive: replace fragmented task-by-task modeling with foundation models for structured decision systems, and fundamentally change how data science is practiced in real organizations.

The Role

As an AI Research Scientist (TFMs), you will design new tabular foundation model architectures, training methods, and evaluation systems that push beyond today’s best baselines and unlock transfer, scale, and reliability for structured data.

This is not “apply transformers to tables.” This role is about building what comes next: new modeling primitives for structured data, new pretraining regimes, and models that can survive real institutional complexity.

Responsibilities

Invent and prototype new TFM architectures that improve generalization, scalability (wide tables, many features), and transfer across tasks and schemas.
Develop learning paradigms for structured data including in-context learning, pretraining on synthetic/real mixtures, and efficient adaptation mechanisms.
Build systems that move TFMs toward institutional readiness, including robustness to temporal drift, schema evolution, leakage resistance, and governance-friendly evaluation.
Lead experiments across large benchmark suites and real-world regimes, with careful methodology around calibration, fairness, and reliability, not just accuracy.
Contribute to open research tooling and infrastructure that accelerates TFM research and adoption, including reproducible evaluation, benchmarking, and fine-tuning workflows.
Publish research in top venues and help shape Lexsi’s research direction in tabular foundation modeling.

Ideal Qualifications

PhD (or equivalent research track record) in ML/AI/CS/Math/Stats or a related quantitative field, with demonstrated ability to execute independent research.
Strong foundation in deep learning and representation learning, with experience designing and analyzing architectures (transformers, attention variants, memory mechanisms, sparse modeling, mixture-of-experts, or related).
Proven ability to run rigorous empirical research: dataset curation, ablations, benchmarking, reproducibility discipline, and meaningful evaluation beyond single-metric wins.
Strong engineering ability in Python and modern ML stacks (PyTorch or JAX), with comfort building research codebases that others can extend.
Depth or strong interest in structured/tabular learning, including challenges like heterogeneity, missingness, feature interactions, schema variation, drift, and real-world decision constraints.
Evidence of research impact through publications (NeurIPS/ICML/ICLR/KDD/ACL/AAAI etc.), open-source contributions, or high-quality research artifacts.

Nice to Have

Experience with tabular deep learning, tabular ICL, or foundation model pretraining regimes for structured data.
Familiarity with institutional ML constraints: drift monitoring, auditability, leakage prevention, calibration, fairness, and governance requirements.
Experience building research infrastructure or libraries used by others (evaluation harnesses, benchmarking tools, training frameworks).

What Success Looks Like

You ship new architectures that move the TFM frontier forward, especially in wide-table and cross-task transfer regimes.
You help evolve TFMs into institutional-grade predictors, with principled handling of drift, schema evolution, leakage resistance, and governance.
You contribute to a research + systems loop where new ideas quickly become reproducible experiments, open tooling, and deployable model families.

Click Apply to learn more.
