Cactus Communications
Website: cactusglobal.com
Job details:
Overview:
CACTUS is a remote-first organization, and we embrace an "accelerate from anywhere" culture. You may be required to travel to our Mumbai office based on business requirements or for company/team events.
We are looking for a Data Scientist to develop and validate high-impact predictive models and AI algorithms. In this role, you will perform deep data exploration, feature engineering, and statistical modeling to build classification and forecasting solutions. Working closely with senior leads and AI/ML engineers, you will ensure that all data science processes align with responsible AI principles while maintaining comprehensive documentation and rigorous quality validation. If you are a hands-on problem solver with experience in large-scale data manipulation and a passion for applying advanced machine learning to real-world challenges, this role offers a platform to contribute to meaningful AI initiatives.
Responsibilities:
- Develop and validate predictive models and AI algorithms under the guidance of senior data science leads.
- Perform data exploration, visualization, and feature selection for model development.
- Implement classification, clustering, and forecasting models.
- Support AI/ML engineers in model training and evaluation activities.
- Create comprehensive documentation for models and data pipelines.
- Participate in performance testing and quality validation of AI outputs.
- Ensure alignment of data science processes with responsible AI principles.
Requirements:
- B.Tech / M.Tech / M.Sc. in Data Science, Computer Science, Statistics, or Mathematics.
- Specialized training or certification in machine learning is an advantage.
- Research papers, case studies, or significant open-source contributions are preferred.
- 3–6 years of professional experience in machine learning or data analytics.
- Hands-on experience in data cleaning, feature engineering, and statistical modeling.
- Exposure to large-scale structured and unstructured datasets.
Technical Competencies:
- Programming Languages: Python (scikit-learn, pandas, numpy), R for statistical analysis, SQL for data manipulation and analysis
- ML Frameworks: TensorFlow, PyTorch, Keras, XGBoost, LightGBM, Hugging Face Transformers for model development
- Data Processing: Advanced pandas, Apache Spark (PySpark), data wrangling, and large-scale data manipulation techniques
- Statistical Analysis: Hypothesis testing, regression analysis, experimental design, and statistical modeling techniques
- Visualization Tools: Matplotlib, Seaborn, Plotly, Tableau, Power BI for data exploration and results communication
- Feature Engineering: Feature selection, dimensionality reduction (PCA, t-SNE), feature scaling, and data transformation techniques
- Model Evaluation: Cross-validation, performance metrics, ROC curves, confusion matrices, and model interpretability (SHAP, LIME)
- Database Technologies: SQL databases, NoSQL systems, and data warehouse querying for analytics and model training
About Cactus:
Established in 2002, CACTUS (cactusglobal.com) is a leading technology company that specializes in expert services and AI-driven products that improve how research gets funded, published, communicated, and discovered. Its flagship brand, Editage, offers a comprehensive suite of researcher solutions, including expert services and cutting-edge AI products like Mind the Graph, Paperpal, and R Discovery. With offices in Princeton, London, Singapore, Beijing, Shanghai, Seoul, Tokyo, and Mumbai, and a global workforce of over 3,000 experts, CACTUS is a pioneer in workplace best practices and has been consistently recognized as a great place to work.