About the role

As a pivotal member of our team, you will lead the development of a robust data lake infrastructure that enables sophisticated predictive analytics.


  • Design and Develop Data Lake: Spearhead the design, architecture, and implementation of a scalable and efficient data lake leveraging technologies such as Hadoop, Spark, and Avro. 
  • Data Ingestion and Integration: Identify, gather, and integrate diverse data sources into the data lake, ensuring data quality, security, and compliance. 
  • Data Cleansing and Transformation: Implement advanced data cleansing and transformation processes to ensure data integrity and consistency, enabling accurate predictive modelling. 
  • NLP, LLM, CNN Expertise: Apply specialized knowledge of Natural Language Processing (NLP), Large Language Models (LLMs), and Convolutional Neural Networks (CNNs) for advanced analytics. 
  • Data Augmentation Techniques: Implement data augmentation strategies to enhance dataset diversity and model performance. 
  • Anomaly Detection: Develop and apply anomaly detection techniques to identify and address irregularities in the dataset. 
  • Collaborate with Data Scientists: Work closely with data scientists to understand their requirements and provide them with the necessary data infrastructure and tools for building and deploying machine learning models. 
  • Implement ML Pipelines: Design and deploy end-to-end machine learning pipelines, including data preprocessing, model training, evaluation, and deployment. 
  • Optimize for Performance: Continuously monitor and fine-tune the data lake and machine learning infrastructure for optimal performance, scalability, and cost-effectiveness. 
  • Security and Compliance: Ensure that data storage and processing within the data lake meet security and compliance standards such as GDPR and HIPAA. 
  • Documentation and Reporting: Create detailed documentation for data lake architecture, processes, and workflows. Provide regular updates and reports on the progress of data lake initiatives.


