Senior Data Engineer
Actualize
- Location: Pune District, Maharashtra, India
- Job type: Full-time
Required skills
- Python
- AWS
- Azure
- communication skills
- compliance
- computer vision
- cross-functional
- data lake
- data modeling
- data solutions
- Databricks
- ETL
- GCP
- hybrid cloud
- metadata management
- Oracle
- predictive analytics
- SAP
- Spark
- SQL
About the role
Website: actualize.co.in
Job details:
Key Responsibilities
- Design, develop, and maintain scalable data architectures for structured and unstructured data including text, images, audio, and video.
- Build and optimize enterprise ETL/ELT pipelines using Python, SQL, Spark/PySpark, and Databricks.
- Integrate and process data from enterprise platforms such as SAP, Oracle, Azure Data Lake, and other cloud/on-prem systems.
- Develop high-performance data pipelines to support AI/ML, computer vision, predictive analytics, and Generative AI use cases.
- Implement large-scale image and video preprocessing workflows for AI-driven applications.
- Work with feature stores, vector databases, embeddings, and LLM-based data workflows.
- Ensure data quality, governance, lineage tracking, metadata management, and security compliance across platforms.
- Collaborate with AI engineers, data scientists, and cross-functional teams to deliver production-ready data solutions.
- Optimize data processing performance, scalability, and reliability in hybrid cloud environments.
- Support data modeling, storage optimization, and centralized data platform initiatives.
Required Skills & Qualifications
- 5+ years of experience in Data Engineering or a related domain.
- Strong hands-on experience with Python, SQL, Spark/PySpark, and Databricks.
- Experience working with Azure/AWS/GCP cloud platforms and hybrid/on-prem environments.
- Hands-on experience with SAP, Oracle, Azure Data Lake, or equivalent enterprise systems.
- Experience building scalable ETL/ELT pipelines for AI/ML workloads.
- Knowledge of computer vision data pipelines, image/video annotation platforms, and preprocessing workflows.
- Experience with data governance, security, metadata management, and pipeline optimization.
- Familiarity with vector databases, embeddings, and Generative AI/LLM data workflows is preferred.
- Strong problem-solving, collaboration, and communication skills.