Spatial Front, Inc. (SFI)—a two-time USA Today Top Workplaces awardee and Washington Post Top Workplaces honoree—is seeking a Kafka Administrator to join our growing team. The selected candidate will support the administration, sustainment, security, and continuous improvement of on-premises messaging and data integration infrastructure in a secure federal environment, with a focus on PeopleSoft HCM systems.
This role centers on Apache Kafka administration across Linux and UNIX/Solaris environments. Responsibilities include ensuring platform reliability, maintaining secure operations, implementing monitoring solutions, automating processes, and optimizing system performance. The ideal candidate brings strong Kafka administration and troubleshooting expertise, proficiency in scripting, and a disciplined operational approach, along with experience supporting highly secure, compliance-driven networks.
Work Location: Hybrid, On-site - Arlington, VA
Key Responsibilities
- Administer, monitor, and sustain Apache Kafka infrastructure supporting PeopleSoft HCM and related enterprise integrations in a secure, on-premises environment.
- Install, configure, patch, upgrade, maintain, and troubleshoot Kafka brokers, clusters, topics, partitions, and supporting components across Linux and UNIX/Solaris-based environments.
- Manage topic provisioning, replication, retention, consumer groups, access controls, certificates, and secure connectivity in accordance with enterprise security and operational requirements.
- Develop and maintain scripts, automations, runbooks, and SOPs to improve platform reliability, reduce manual effort, and support repeatable administration.
- Monitor cluster health, throughput, latency, consumer lag, disk utilization, JVM performance, logs, and alerts; investigate incidents and coordinate timely resolution.
- Support platform hardening, audit readiness, change control, and compliance with security requirements in highly secured or DISA-aligned network environments.
- Support high availability, backup and recovery, capacity planning, and failover readiness across development, test, and production environments.
- Collaborate with application, middleware, ETL, database, infrastructure, and cybersecurity teams to support data movement, integration patterns, and stable operations.