Prophecy Technologies
Website: prophecytechs.com
Job details:
Experience: 7+ Years
Location: Pan India (Hybrid)
Role Summary
The Senior Data Engineer is responsible for designing, building, and optimizing large-scale data extraction and delivery components within the Compliance Store Data Delivery subsystem. The role focuses on index-based data discovery, scalable data access from Snowflake and Amazon S3, and building reusable, composable data delivery skills and pipelines. This role works closely with Solution Architects, Backend/API Engineers, and QA teams to ensure secure, compliant, and performant delivery of structured and unstructured data.
Key Responsibilities
- Design and develop data extraction logic using Snowflake (including Iceberg / external tables) and Amazon S3 (Parquet format)
- Implement index-search-based data discovery (L1/L2/L3 index strategy) to identify relevant data partitions
- Build scalable data retrieval and staging mechanisms for batch and API-driven extraction requests
- Develop reusable, composable Skills (microservices) for data extraction, transformation, validation, and enrichment
- Support both asynchronous batch extraction workflows and synchronous API-based queries
- Optimize query performance and extraction efficiency for large-scale compliance requests
- Collaborate with Backend/API Engineers to integrate extraction logic with FastAPI-based Data Access APIs
- Ensure proper audit logging of search, extraction, and delivery events to Data Operations Metadata (DOM)
- Work with security teams to apply Privacera / DCAP-based PII and PHI masking rules
- Support unstructured data delivery use cases (PDF, TIFF, scanned documents, attachments) via index-based retrieval
- Participate in design reviews, code reviews, and performance tuning activities
- Support SIT and UAT activities by resolving data issues and validating extraction results
Required Skills and Experience
- 7–10+ years of experience in Data Engineering or Data Platform development
- Strong hands-on experience with Snowflake, including query optimization and external / Iceberg tables
- Experience working with Amazon S3 and Parquet-based data formats
- Proficiency in Python for data processing and pipeline development
- Experience designing and building large-scale batch data extraction pipelines
- Strong understanding of data indexing, partitioning, and retrieval strategies
- Experience working with REST APIs or data access services
- Exposure to microservices-based architectures and containerized deployments
- Knowledge of PII/PHI handling, data masking, and compliance requirements is a strong plus
- Experience in healthcare, insurance, or regulated enterprise environments preferred
Preferred Qualifications
- Experience with FastAPI or similar Python API frameworks
- Familiarity with ECG or SFTP-based batch delivery mechanisms
- Experience with Spark, serverless data processing, or distributed workloads
- Exposure to audit logging and metadata-driven data platforms
- Strong problem-solving skills and ability to work in a fast-paced delivery environment
Key Interfaces
- Solution Architect – for architecture alignment and design decisions
- Backend/API Engineers – for API integration and pipeline triggers
- Security and Compliance Teams – for PII/PHI masking and access controls
- QA/Test Engineers – for data validation, reconciliation, and UAT support
- Business and Compliance Stakeholders – during UAT and acceptance phases