Description
Key Responsibilities:
- Conceptualize research problems, design studies, and lead the development of advanced analytic and ML solutions across supervised, unsupervised, NLP, graph, and (where appropriate) generative-AI techniques.
- Translate ambiguous mission questions into clearly defined hypotheses, data requirements, and modeling approaches.
- Author and review implementation roadmaps, data exploration reports, model prototype evaluations, and final model analysis reports.
- Build, validate, and harden production models — including model cards, bias and fairness assessments, drift monitoring, and reproducibility artifacts.
- Lead code reviews, establish coding standards, and mentor data scientists and analysts on the team.
- Continuously update and enhance analytic dashboards used to model real-world scenarios and identify potential mission impacts.
- Represent the team in technical reviews, working groups, and stakeholder briefings; advise senior project personnel on technical matters.
- Stay current on emerging ML, MLOps, and responsible-AI practices and recommend adoption where they advance the mission.
Requirements
- Ten (10)+ years of relevant experience in applied research, big data analytics, statistics, applied mathematics, data science, computer science, or operations research.
- Seven (7)+ years of direct experience in machine learning.
- Master's or Ph.D. in Statistics, Applied Mathematics, Data Science, Computer Science, Operations Research, or a closely related quantitative or technical discipline. (Ph.D. may substitute for up to three years of experience.)
- Demonstrated ability to create and validate data mining methods, ML models, and analytical results delivered through reporting and visualization.
- Strong communication skills covering analysis techniques, testing, and model validation processes for both technical and non-technical audiences.
Preferred Qualifications:
- Experience in financial crime, fraud detection, regulatory analytics, supply-chain, or other high-stakes mission domains.
- Hands-on experience with modern NLP / LLMs, including retrieval-augmented generation (RAG), embedding models, fine-tuning, prompt engineering, and evaluation frameworks.
- Experience with graph analytics for entity resolution, network risk, and link analysis.
- Experience with MLOps pipelines, feature stores, model registries, and production monitoring for drift and bias.
- Publications, patents, or open-source contributions in machine learning.
Tools & Technologies
- Languages: Python (pandas, NumPy, scikit-learn, PyTorch, TensorFlow, Hugging Face Transformers, spaCy, NetworkX), R, SQL.
- ML / MLOps: MLflow, Kubeflow, SageMaker, Azure ML, Vertex AI, Weights & Biases, DVC, Airflow, dbt.
- LLMs & GenAI: OpenAI / Anthropic / Bedrock APIs, LangChain, LlamaIndex, vector stores (FAISS, pgvector, Pinecone, OpenSearch).
- Big data: Spark / PySpark, Databricks, Snowflake, Dask, Ray.
- Visualization: Tableau, Power BI, Plotly, Streamlit, Dash.
- Cloud (gov): AWS GovCloud, Azure Government.
- Collaboration & code: Git/GitHub, Jupyter, VS Code, Docker, Kubernetes.
Clearance & Suitability
U.S. Citizenship required. Candidates must currently possess or be able to favorably pass a five (5) year federal background investigation prior to start. All candidates must clear OneGlobe's pre-screening process, which includes review for felony convictions in the past 36 months, illegal drug use in the past 12 months, relevant misconduct, and a financial background check. Work is primarily UNCLASSIFIED and performed at a federal customer site in the Washington, D.C. metropolitan area, with potential for hybrid arrangements per program policy. Occasional travel may be required.