Pace Wisdom Solutions
Website: pacewisdom.com
Job Description:
Location: Bangalore (Hybrid)
Role: Data Engineer – Business Insights
Experience Required
5–7+ years of experience in Data Engineering, Big Data, or Analytics Engineering.
Role Overview and Key Responsibilities
We are looking for a highly skilled Data Engineer to build and manage scalable data pipelines for our 4PL (Fourth-Party Logistics) Business Insights platform. The ideal candidate will design and implement robust ingestion, transformation, and analytics-ready data infrastructure that powers AI-driven business insights and operational intelligence.
- Build end-to-end pipelines on the existing Kafka spine, using Debezium CDC and Apache Flink for streaming transformation, and support bulk ingestion from CSV and other flat-file sources.
- Working experience with Apache Iceberg on Amazon S3 is required.
- Familiarity with ClickHouse for building customer dashboards and with Trino/Athena for historical queries.
- Design, develop, and maintain scalable data pipelines for ingesting logistics and operational data into the analytics platform.
- Strong SQL skills and experience optimizing analytical queries.
- Familiarity with containerization and cloud-native deployments.
- Proficiency in Python, Scala, or Java.
Data Lake & Warehouse Management
- Manage and optimize data flow from Kafka topics into S3-based storage layers. Build ETL/ELT pipelines to transform and load data into ClickHouse for high-performance analytical querying.
- Design partitioning, indexing, and schema strategies in ClickHouse for low-latency AI and BI workloads.
AI & Analytics Enablement
- Enable AI agents and analytics applications to efficiently query ClickHouse datasets.
- Ensure data quality, consistency, and availability for downstream AI-driven insights.
- Collaborate with AI/ML teams to expose optimized datasets and semantic models.
Platform Reliability & Optimization
- Monitor and optimize pipeline performance, storage efficiency, and query latency.
- Implement observability, alerting, and retry mechanisms for ingestion pipelines.
- Ensure scalability, fault tolerance, and data governance best practices.
Collaboration
- Work closely with:
  - Product teams
  - Business Insights teams
  - AI/ML engineers
  - Platform engineering teams
- Participate in architecture discussions and contribute to long-term data platform strategy.
Required Skills & Qualifications
Technical Skills
- Strong experience in building distributed data pipelines.
- Hands-on expertise with:
Data Engineering Concepts
- ETL/ELT pipeline design
- Data modeling for analytics
- Data partitioning and indexing strategies
- Schema evolution and metadata management
- Monitoring and observability
Nice to Have
- Experience with logistics, supply chain, or 4PL platforms.
- Exposure to AI/LLM-based analytics systems.
- Familiarity with vector search or AI retrieval architectures.
- Experience with dbt or modern data stack tools.
- Knowledge of Iceberg, Delta Lake, or Parquet optimization.
Preferred Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- Experience working in high-scale analytics or real-time data environments.
Success Metrics
- Reliable real-time and batch ingestion pipelines.
- Optimized ClickHouse performance for AI-agent querying.
- Reduced latency for analytics and reporting workloads.
- High data quality and pipeline uptime.