Warner Bros. Discovery
Website: wbd.com
Job details:
Meet Our Team
We are a platform-focused engineering team made up of Junior, Mid-level, Senior, and Staff engineers who specialize in distributed asynchronous messaging systems.
Our mission is simple but ambitious:
Make event-driven architecture safe, scalable, and easy to adopt across the company.
What defines our team:
- We treat messaging as a product, not just infrastructure.
- We believe in automation over manual operations.
- We design for multi-region, high-throughput, mission-critical workloads.
- We invest in documentation, internal enablement, and reusable frameworks.
- We balance platform reliability with developer experience.
This is a team where you can deepen your expertise in Kafka, cloud infrastructure, and platform engineering — while influencing architecture across the organization.
About the Role
We are looking for a Software Engineer II to join our Kafka Platform team and help build and scale our Kafka-as-a-Service offering on top of Confluent Cloud.
You will contribute to the design, automation, and evolution of a global event streaming platform that powers high-traffic telemetry, business-critical messaging, and large-scale data processing workloads.
This role is ideal for an engineer who combines strong software development fundamentals with hands-on infrastructure and cloud experience, and who enjoys solving distributed systems challenges at scale.
What You’ll Do
- Contribute to the development and evolution of our Kafka-as-a-Service platform built on Confluent Cloud.
- Design and implement scalable, resilient, and secure Kafka architectures across multiple regions.
- Build and maintain Infrastructure-as-Code (IaC) workflows using Terraform.
- Improve self-service capabilities for internal teams (topic provisioning, schema management, access control, etc.).
- Solve complex challenges around:
  - Scalability & throughput
  - Multi-region replication
  - Schema Registry management
  - Security & compliance
  - High availability & disaster recovery
- Participate in on-call rotations and support production systems.
- Develop automation and tooling in Python, Java, or Go to improve reliability and developer experience.
- Contribute to monitoring and platform observability.
What We’re Looking For
- Bachelor's degree in computer science or related field (or equivalent experience).
- 3–5 years of professional software development experience.
- Hands-on experience with Kafka administration or operations (preferably Confluent Cloud).
- Experience working with AWS cloud infrastructure.
- Practical experience with Terraform or another Infrastructure-as-Code tool.
- Strong programming experience in at least one of the following: Python, Java, or Go.
- Experience operating and supporting production systems.
- Strong understanding of software engineering best practices:
  - Testing & automation
  - Code reviews
  - CI/CD
  - Observability & monitoring
- Strong computer science fundamentals (data structures, algorithms, distributed systems basics).
Preferred Qualifications
- Operating Kafka at scale, including:
  - Topic configuration & partitioning strategy
  - Consumer group optimization
  - Schema evolution & governance
  - Replication strategies
- Event-driven architecture and streaming frameworks (e.g., Kafka Streams).
- Cloud infrastructure automation (AWS preferred).
- Database systems (Postgres, NoSQL, etc.).
- Experience participating in on-call rotations and incident response.
- Strong written communication skills, including architecture documentation and diagramming.
- Experience collaborating with globally distributed teams.
What Makes You Successful in This Role
Success in this role means growing from a strong contributor into a reliable platform engineer who can independently deliver meaningful improvements.
You will thrive in this role if you:
- Own Problems End-to-End
  - Take well-defined problems and drive them through design, implementation, testing, and production rollout.
  - Follow through on operational excellence: not just writing code, but ensuring it runs reliably in production.
  - Proactively identify inefficiencies and suggest improvements.
- Build with Production in Mind
  - Think about scalability, resiliency, observability, and failure modes while designing solutions.
  - Write clean, testable, and maintainable code.
  - Value automation and prefer infrastructure-as-code over manual processes.
- Develop Distributed Systems Intuition
  - Understand asynchronous patterns and trade-offs (throughput, ordering, retries, backpressure).
  - Ask thoughtful questions about partitioning strategy, replication, and consumer scaling.
  - Continuously deepen your understanding of Kafka and cloud-native systems.
- Elevate the Team Around You
  - Participate actively in code reviews and technical discussions.
  - Contribute to improving documentation and shared standards.
  - Support teammates during incidents and learn from postmortems.
- Operate with Accountability
  - Take ownership of the services and components you build.
  - Participate in on-call rotations responsibly.
  - Learn from production issues and help prevent repeat incidents.