IDeaS Revenue Solutions
Website: ideas.com
Job details:
We’re looking for strong Java engineers who’ve owned production systems and want to focus on reliability, scalability, and resilience.
This is for you if you’ve:
- Built and operated large-scale Java/JVM services
- Carried on-call and handled real production incidents
- Debugged JVM, GC, latency, and concurrency issues under pressure
- Implemented resilience patterns (circuit breakers, timeouts, graceful degradation)
What you’ll do:
- Own availability, latency, and reliability of critical services
- Improve systems through code-level reliability, not just infra
- Define SLIs/SLOs, lead incident reviews, reduce toil
- Partner with product teams to design for failure
Reliability mindset (what differentiates this role):
- Experience implementing or driving:
  - Circuit breakers, bulkheads, rate limiting, backpressure
  - Graceful degradation and fallback strategies
- Familiarity with observability concepts:
  - Metrics (e.g., latency percentiles, saturation)
  - Distributed tracing
  - Health checks & readiness probes
Nice to have (not mandatory):
- Exposure to SRE / Platform / Production Engineering
- Kubernetes, observability, or chaos engineering experience
📌 Software-first SRE role | Real ownership | Strong growth into reliability leadership