About Responsive
Responsive (formerly RFPIO) is the global leader in strategic response management software, transforming how organizations share and exchange critical information. The AI-powered Responsive Platform is purpose-built to manage responses at scale, empowering companies across the world to accelerate growth, mitigate risk and improve employee experiences. Nearly 2,000 customers have standardized on Responsive to respond to RFPs, RFIs, DDQs, ESGs, security questionnaires, ad hoc information requests and more. Responsive is headquartered in Portland, OR, with additional offices in Kansas City, MO, and Coimbatore, India. Learn more at responsive.io.
About The Role
We are seeking a highly skilled Product Data Engineer with expertise in building, maintaining, and optimizing data pipelines using Python scripting. The ideal candidate will have experience working in a Linux environment, managing large-scale data ingestion, processing files in S3, and balancing disk space and warehouse storage efficiently. In this role, you will be responsible for ensuring seamless data movement across systems while maintaining performance, scalability, and reliability.
Essential Responsibilities
- ETL Pipeline Development: Design, develop, and maintain efficient ETL workflows using Python to extract, transform, and load data into structured data warehouses.
- Data Pipeline Optimization: Monitor and optimize data pipeline performance, ensuring scalability and reliability in handling large data volumes.
- Linux Server Management: Work in a Linux-based environment, executing command-line operations, managing processes, and troubleshooting system performance issues.
- File Handling & Storage Management: Efficiently manage data files in Amazon S3, ensuring proper storage organization, retrieval, and archiving of data.
- Disk Space & Warehouse Balancing: Proactively monitor and manage disk space usage, preventing storage bottlenecks and ensuring warehouse efficiency.
- Error Handling & Logging: Implement robust error-handling mechanisms and logging systems to monitor data pipeline health.
- Automation & Scheduling: Automate ETL processes using cron jobs, Airflow, or other workflow orchestration tools.
- Data Quality & Validation: Ensure data integrity and consistency by implementing validation checks and reconciliation processes.
- Security & Compliance: Follow best practices in data security, access control, and compliance while handling sensitive data.
- Collaboration with Teams: Work closely with data engineers, analysts, and product teams to align data processing with business needs.
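To illustrate the kind of work the responsibilities above describe, here is a minimal ETL sketch in Python: extract rows from raw CSV, transform them with validation and logging of malformed records, and load them into a warehouse table with a row-count reconciliation check. All names (`payments`, `RAW_CSV`) are illustrative, and sqlite3 plus an inline string stand in for a real warehouse and an S3 object; production pipelines would use boto3, a warehouse client, and an orchestrator such as Airflow.

```python
import csv
import io
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")

# Stand-in for a raw file fetched from S3; the third row is deliberately malformed.
RAW_CSV = "id,amount\n1,10.5\n2,20.0\n3,not_a_number\n"

def extract(raw: str) -> list:
    """Parse the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows: list) -> list:
    """Cast fields to their target types, logging and skipping bad rows."""
    clean = []
    for row in rows:
        try:
            clean.append((int(row["id"]), float(row["amount"])))
        except ValueError:
            log.warning("skipping malformed row: %r", row)
    return clean

def load(rows: list, conn: sqlite3.Connection) -> int:
    """Insert cleaned rows and reconcile the loaded count against the input."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments (id INTEGER PRIMARY KEY, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()
    (count,) = conn.execute("SELECT COUNT(*) FROM payments").fetchone()
    assert count == len(rows), "row count mismatch after load"
    return count

conn = sqlite3.connect(":memory:")
loaded = load(transform(extract(RAW_CSV)), conn)
# Two valid rows survive the transform; the malformed row is logged and dropped.
```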
Education
- Bachelor’s degree in Computer Science, Data Engineering, or a related field.
Experience
- 2+ years of experience in ETL development, data pipeline management, or backend data engineering.
- Proficiency in Python: Strong hands-on experience in writing Python scripts for ETL processes.
- Linux Expertise: Experience working with Linux servers, command-line operations, and system performance tuning.
- Cloud Storage Management: Hands-on experience with Amazon S3, including handling file storage, retrieval, and lifecycle policies.
- Data Pipeline Management: Experience with ETL frameworks, data pipeline automation, and workflow scheduling (e.g., Apache Airflow, Luigi, or Prefect).
- SQL & Database Handling: Strong SQL skills for data extraction, transformation, and loading into relational databases and data warehouses.
- Disk Space & Storage Optimization: Ability to manage disk space efficiently, balancing usage across different systems.
- Error Handling & Debugging: Strong problem-solving skills to troubleshoot ETL failures, debug logs, and resolve data inconsistencies.
- Cloud Data Warehouses: Experience with cloud data warehouses (e.g., Snowflake, Redshift, BigQuery).
- Message Queues: Knowledge of message queues (Kafka, RabbitMQ) for data streaming.
- Containerization: Familiarity with containerization tools (Docker, Kubernetes) for deployment.
- Infrastructure Automation: Exposure to infrastructure automation tools (Terraform, Ansible).
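The disk-space monitoring called for above can be sketched with the standard library alone: measure usage at a mount point and flag when it crosses a threshold. The `should_archive` helper and the 85% threshold are illustrative assumptions; in practice the trigger might kick off archival of cold files to S3 rather than simply returning a flag.

```python
import shutil

def disk_usage_pct(path: str = "/") -> float:
    """Return the percentage of disk space currently used at `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100

def should_archive(path: str = "/", threshold: float = 85.0) -> bool:
    """True when usage crosses the threshold (illustrative 85% default),
    signaling that cold files should be moved off local disk."""
    return disk_usage_pct(path) >= threshold
```

A cron job or Airflow sensor could run this check on a schedule and alert or archive as needed.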
Knowledge, Ability & Skills
- Strong analytical mindset and ability to handle large-scale data processing efficiently.
- Ability to work independently in a fast-paced, product-driven environment.