About the role
The Tooling Team is responsible for creating and maintaining the infrastructure, tools, and resources that enable CAST AI product teams to develop, deploy, and operate software effectively. They provide the critical foundation on which product teams build, helping to speed up development and ensure reliability.
The team focuses on designing scalable, high-performance systems that meet the needs of engineering teams and on ensuring that internal company security requirements are covered. The team's goal is to build and maintain tooling and platforms that minimize friction for developers, enabling them to ship features quickly while addressing technical challenges such as performance, availability, and developer experience.
As the company scales, the Tooling Team ensures the platform can support a growing, more complex organization and its evolving needs. Currently, the team is responsible for the observability platform, CI/CD pipelines, and the local development setup, with the scope of ownership expected to expand as the company and the Tooling Team grow.
Responsibilities:
Improve developer experience metrics: use developer experience measurements to identify where developer experience is suffering the most and create custom internal tooling to improve these metrics in a measurable way.
Improve continuous integration and delivery: manage and optimize CI/CD pipelines using tools like GitLab Pipelines, GitHub Actions, ArgoCD, and Helm, ensuring efficient and reliable deployment processes.
Development environment management: enable other engineering teams by maintaining and extending the existing local development tooling managed by Tilt.
Oversee incident management systems: integrate with incident management and alerting tools such as Opsgenie, PagerDuty, or similar to enhance our response capabilities and reduce downtime.
Improve internal company security posture: manage and improve the security posture of our Kubernetes clusters and other internal components by defining and enforcing security policies, while keeping the developer experience great by building components that make developers secure by default.
Requirements:
Proficiency with Go
A passion for automation and developer tooling
Strong problem-solving skills and the ability to troubleshoot complex issues in a production environment
An understanding of the latest security principles, techniques, and protocols, especially in and around Kubernetes
A "yes we can" attitude
Ability to work independently or with a group
Strong written and verbal communication skills in English
You must be physically located in a European country within the GMT+0 to GMT+3 time zones.
About the company
CAST AI is the leading Kubernetes automation platform for AWS, GCP and Azure customers. The company is on a mission to deliver a fully automated Kubernetes experience. What's unique about CAST AI is that its platform goes beyond monitoring clusters and making recommendations; it utilizes advanced machine learning algorithms to analyze and automatically optimize clusters, saving customers 50% or more on their cloud spend, improving performance and reliability, and boosting DevOps and engineering productivity.