Inferyx
Website: inferyx.com
Job details:
About the Company:
We are a global analytics company with a mission to empower enterprises to build scalable and robust applications and solutions based on artificial intelligence and machine learning. We are a team of data engineers and data scientists helping businesses with actionable intelligence and data-driven decisions. The Inferyx platform is an end-to-end data and analytics platform that lets you disrupt and accelerate with data.
Job Title: Lead Data Engineer
Experience: 8+ Years
Location: Pune
Employment Type: Full-time
Job Description:
We are looking for a Lead Data Engineer to design and develop scalable data pipelines and integrations that support continual increases in data volume and complexity. You will work with analytics and business teams to understand their needs, create pipelines, and improve the data models that feed BI and visualization tools.
Key Responsibilities:
Data Pipeline Development
Design, develop, and maintain scalable batch and streaming data pipelines using Apache Spark (PySpark/Scala) and Databricks. Build end-to-end ETL/ELT workflows for ingesting, transforming, and validating data from diverse source systems while ensuring data accuracy, reliability, and performance.
Data Modeling & Analytics Enablement
Design and maintain efficient data models, schemas, and curated datasets that support business analytics, reporting, and visualization tools. Optimize data structures for performance, scalability, and cost across lakehouse and data warehouse platforms.
Data Integration
Integrate data from multiple internal and external sources, including relational databases, APIs, flat files, and streaming sources. Ensure seamless and reliable data movement across cloud platforms, data lakes, and analytics systems.
Performance Optimization
Identify and resolve performance bottlenecks in Spark jobs, Databricks workloads, and data storage layers. Tune Spark configurations, optimize queries, and improve pipeline efficiency to support large-scale data processing.
Data Quality & Governance
Implement data quality checks, validation rules, and governance standards to ensure trustworthy data. Monitor data quality metrics and proactively address data issues in collaboration with stakeholders.
Collaboration & Stakeholder Engagement
Work closely with data analysts, data scientists, and business teams to understand requirements and deliver data solutions aligned with business objectives. Partner with platform and cloud teams to ensure architectural consistency and best practices.
Documentation & Best Practices
Document data pipelines, data models, and technical designs. Follow best practices for software development, version control, CI/CD, and deployment in distributed data environments.
Continuous Improvement
Stay current with emerging data engineering technologies, Spark and Databricks enhancements, and cloud data platform innovations. Drive automation and process improvements to increase reliability, scalability, and developer productivity.
Required Skills and Qualifications:
● Bachelor's degree in Computer Science, Engineering, or a related field.
● 8+ years of experience in data engineering or related roles.
● Proficiency in programming languages such as Python, Java, or Scala.
● Strong SQL skills and experience with relational databases (e.g., MySQL, PostgreSQL).
● Experience with data warehousing concepts and technologies (e.g., Snowflake, Redshift).
● Familiarity with big data processing frameworks (e.g., Apache Spark, Hadoop).
● Hands-on experience with ETL tools and data integration platforms.
● Knowledge of cloud platforms such as AWS, Azure, or Google Cloud Platform.
● Understanding of data modeling principles and data warehousing design patterns.
● Excellent problem-solving skills and attention to detail.
● Strong communication and collaboration skills, with the ability to work effectively in a team environment.