Data Scientist

Min Experience

3 years

Location

Remote Global

Job Type

Full-Time

About the job

This job is sourced from a job board

About the role

Description: As a Data Scientist at Encardio, you will analyze complex time-series data from devices such as accelerometers, strain gauges, and tilt meters. Your responsibilities will span data preprocessing, feature engineering, machine learning model development, and integration with real-time systems. You'll collaborate closely with engineers and domain experts to translate physical behaviours into actionable insights. This role is ideal for someone with strong statistical skills, experience in time-series modeling, and a desire to understand the real-world impact of their models in civil and industrial monitoring.

Responsibilities

1. Sensor Data Understanding & Preprocessing
   o Clean, denoise, and preprocess high-frequency time-series data from edge devices.
   o Handle missing, corrupted, or delayed telemetry from IoT sources.
   o Develop domain knowledge of physical sensors and their behaviour (e.g., vibration patterns, strain profiles).

2. Exploratory & Statistical Analysis
   o Perform statistical and exploratory data analysis (EDA) on structured/unstructured sensor data.
   o Identify anomalies, patterns, and correlations in multi-sensor environments.

3. Feature Engineering
   o Generate meaningful time-domain and frequency-domain features (e.g., FFT, wavelets).
   o Implement scalable feature extraction pipelines.

4. Model Development
   o Build and validate ML models for:
      · Anomaly detection (e.g., vibration spikes)
      · Event classification (e.g., tilt angle breaches)
      · Predictive maintenance (e.g., time-to-failure)
   o Leverage traditional ML, deep learning, and LLMs.

5. Deployment & Integration
   o Work with Data Engineers to integrate models into real-time data pipelines and edge/cloud platforms.
   o Package and containerize models (e.g., with Docker) for scalable deployment.

6. Monitoring & Feedback
   o Track model performance post-deployment and retrain/update as needed.
   o Design feedback loops using human-in-the-loop or rule-based corrections.

7. Collaboration & Communication
   o Collaborate with hardware, firmware, and data engineering teams.
   o Translate physical phenomena into data problems and insights.
   o Document approaches, models, and assumptions for reproducibility.

🎯 Key Deliverables

1. Reusable preprocessing and feature extraction modules for sensor data.
2. Accurate and explainable ML models for anomaly/event detection.
3. Model deployment artifacts (Docker images, APIs) for cloud or edge execution.
4. Jupyter notebooks and dashboards (Streamlit) for diagnostics, visualization, and insight generation.
5. Model monitoring reports and performance metrics with retraining pipelines.
6. Domain-specific data dictionaries and technical knowledge bases.
7. Contribution to internal documentation and research discussions.
8. Build deep understanding and documentation of sensor behavior and characteristics.

🔧 Technologies

Languages & Libraries
· Python (NumPy, Pandas, SciPy, Scikit-learn, PyTorch/TensorFlow)
· Bash (for data ops & batch jobs)

Signal Processing & Feature Extraction
· FFT, DWT, STFT (via SciPy, Librosa, tsfresh)
· Time-series modeling (sktime, statsmodels, Prophet)

Machine Learning & Deep Learning
· Scikit-learn (traditional ML)
· PyTorch / TensorFlow / Keras (deep learning)
· XGBoost / LightGBM (tabular modeling)

Data Analysis & Visualization
· Jupyter, Matplotlib, Seaborn, Plotly, Grafana (for dashboards)

Model Deployment
· Docker (for containerizing ML models)
· FastAPI / Flask (for ML inference APIs)
· GitHub Actions (CI/CD for models)
· ONNX / TorchScript (for lightweight deployment)

Data Engineering Integration
· Kafka (real-time data ingestion)
· S3 (model/data storage)
· Trino / Athena (querying raw and processed data)
· Argo Workflows / Airflow (model training pipelines)

Monitoring & Observability
· Prometheus / Grafana (model & system monitoring)

About the company

ENCARDIO-RITE ELECTRONICS PRIVATE LIMITED

Skills

python
numpy
pandas
scipy
scikit-learn
pytorch
tensorflow
bash
fft
dwt
stft
time-series
keras
xgboost
lightgbm
jupyter
matplotlib
seaborn
plotly
grafana
docker
fastapi
flask
github-actions
onnx
torchscript
kafka
s3
trino
athena
argo-workflows
airflow
prometheus