Viraaj HR Solutions Private Limited
Website: viraajhrsolutions.com
Job details:
Role & Responsibilities
- Develop and optimize data processing pipelines using PySpark to enhance data throughput and reliability.
- Collaborate with data analysts and scientists to implement scalable data solutions and support analytics initiatives.
- Design, build, and maintain efficient ETL processes to facilitate seamless data integration.
- Assist in data quality management and validation to ensure accuracy and consistency across datasets.
- Bridge data infrastructure gaps by integrating various data sources into the big data architecture.
- Document data workflows, pipelines, and processes for clarity and future scalability.
Skills & Qualifications
- Must-Have
  - Proficiency in PySpark and the Apache Spark framework
  - Strong programming skills in Python
  - Experience with SQL and relational database management
  - Understanding of Big Data architecture and the Hadoop ecosystem
  - Hands-on experience in developing ETL pipelines
  - Familiarity with data validation and quality control techniques
- Preferred
  - Experience with cloud platforms like AWS or Azure
  - Knowledge of data visualization tools
  - Prior experience in an Agile development environment
Skills: SQL, Python, data processing, Apache Spark, big data, Hadoop