Seventh Contact Hiring Solutions
Website: seventhcontact.com
Job details:
Job Title: Senior Data Engineer (Python and Java)
Location: Pune - Baner (hybrid)
Years of experience: 3-7 years
Responsibilities:
Core Data Engineering Responsibilities:
Architect and implement large-scale data pipelines using Python and Java.
Design and maintain ETL/ELT workflows, supporting analytics, operational systems, and ML pipelines.
Build optimized data models and work with relational (PostgreSQL) and document databases (MongoDB).
Develop and scale data microservices and APIs using Python or Java.
Implement data quality checks, monitoring, and data reliability best practices.
Optimize the performance of large data processing pipelines through parallelization, batching, and caching strategies.
AI/ML/GenAI Collaboration:
Work with ML engineers to understand data needs for model training, feature engineering, and inference pipelines.
Contribute to building foundational components of RAG pipelines, embeddings, and vector storage, with guidance from AI systems engineers.
Understand the basics of LLMs, embeddings, and vector databases (Pinecone, Weaviate, FAISS, etc.).
Assist in integrating AI-based search, summarization, or automation features into data workflows.
Participate in upskilling programs and cross-team learning to grow your GenAI and AI engineering capabilities.
Platform, DevOps & CI/CD (Awareness-Level)
Work with DevOps teams to implement CI/CD for data pipelines (AWS, Kubernetes, Terraform, etc.).
Understand provisioning, monitoring, and observability for data systems.
Contribute to infrastructure-as-code practices in collaboration with the platform team.
Leadership & Ownership
Provide mentorship to junior data engineers.
Lead problem-solving sessions and troubleshoot data pipeline issues within defined SLAs.
Collaborate actively with product, ML, and engineering teams to ensure data readiness.
Skills & Qualifications
Must-Have (Core Competencies)
Bachelor’s/Master’s degree in Computer Science, Data Engineering, or a related field.
5+ years of experience in data engineering, with strong production-level delivery experience.
Expert-level proficiency in Python and strong experience with Java.
Deep experience with data modelling, ETL pipeline design, and workflow orchestration (Airflow, Prefect).
Hands-on experience with PostgreSQL and MongoDB.
Strong grasp of distributed systems, microservices, and high-performance data processing.
Nice-to-Have
Exposure to LLMs, embeddings, and AI/ML concepts.
Basic understanding of vector databases (Pinecone, FAISS, Weaviate, and Chroma).
Familiarity with prompt engineering, GenAI concepts, and RAG pipeline basics.
Understanding of AI agent frameworks (LangChain, LlamaIndex, CrewAI, etc.).
Working knowledge of AWS services, Kubernetes, or data pipeline CI/CD.
Preferred:
Experience in insurance, fintech, or regulated data environments.