Prismberry Technologies
Website: prismberry.com
Job details:
Role Overview
We are seeking a hands-on Database Engineer with a "reliability mindset" to own the stability, performance, and scalability of our mission-critical data layer. You will manage high-velocity data stores in a dynamic e-commerce environment, focusing on Aerospike, Cassandra, or Redis. The ideal candidate combines deep database internals knowledge with strong automation skills (Python/Shell) to minimize toil and ensure 99.99%+ uptime.
Key Responsibilities
● Database Operations: Manage the full lifecycle of NoSQL clusters (Aerospike, Cassandra, Redis), including provisioning, upgrades, patching, and capacity planning.
● Reliability & Uptime: Maintain extremely high availability. Design and execute Disaster Recovery (DR) drills, manage cross-region replication, and ensure robust backup/restoration mechanisms are in place.
● Performance Tuning: Proactively monitor database performance. Troubleshoot and resolve complex functional and latency issues in a high-throughput environment.
● Automation & Tooling: Write production-grade Python scripts and Shell automation to handle routine tasks, database migrations, and monitoring alerts.
● Cloud & Infrastructure: Manage cloud infrastructure (AWS/GCP/Azure) and lead migrations across cloud providers or regions. Write and maintain detailed runbooks and Infrastructure-as-Code (IaC) configurations.
● Production Discipline: Execute production changes in strict adherence to change management protocols to minimize the risk of downtime.
Required Qualifications
● Experience: 3–7 years of relevant experience in Database Engineering or Site Reliability Engineering (SRE).
● Core Tech Stack: Deep hands-on production experience with at least two of the following: Aerospike, Cassandra, and Redis.
● Scripting: Proficiency in Python and Shell/Bash scripting for automation and tool building.
● Cloud Native: Solid experience with cloud infrastructure (AWS, GCP, or Azure), including migrating databases between clouds or hybrid environments.
● Operational Excellence: Proven track record of maintaining high-availability (HA) systems, managing DR strategies, and writing comprehensive runbooks.
Preferred Qualifications
● Industry Background: Previous experience in E-commerce, Fintech, or similar high-traffic, low-latency industries.
● Methodology: Experience working in Agile/Scrum environments.
● Tooling: Familiarity with monitoring tools (Prometheus, Grafana, Datadog) and with configuration management and Infrastructure-as-Code tools (Ansible, Terraform).