About the role
Join us as we pursue our disruptive new vision to make machine data accessible, usable and valuable to everyone via AI. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we're committed to our work, our customers, having fun and, most importantly, each other's success. Learn more about Splunk careers and how you can become a part of our journey!
Role
As a Software Engineer, AI Platform on the Splunk AI Dev Tools team, you will build the services and tools that underlie our AI products and accelerate the work of our AI developers. This role combines traditional software engineering with modern cloud infrastructure and machine learning technologies. The ideal candidate will help build and maintain scalable AI systems while ensuring robust deployment and operational excellence.
Responsibilities
Design, develop, and maintain AI services in the AI platform that powers all of our AI products
Implement and optimize CI/CD pipelines for machine learning model deployment
Architect and manage cloud infrastructure using Infrastructure as Code principles
Collaborate with ML Engineers and Applied Scientists to build efficient model serving systems
Monitor system performance and implement improvements for scalability
Participate in code reviews and contribute to technical documentation
Troubleshoot production issues and implement solutions
Required Qualifications
US Citizenship
Bachelor's degree in Computer Science, Software Engineering, or related field
3+ years of experience in software development using languages such as Python and Go
Strong experience with cloud platforms (AWS, GCP, or Azure)
Proficiency in containerization technologies (Docker, Kubernetes)
Experience with CI/CD tools (Jenkins, GitLab CI, or similar)
Knowledge of Infrastructure as Code (Terraform, CloudFormation, or similar)
Understanding of RESTful APIs
Knowledge of data storage technologies, including cloud object stores (e.g. S3), databases (e.g. PostgreSQL), and vector databases
Knowledge of monitoring and observability tools (Prometheus, Grafana, Splunk)
Experience with version control systems (Git) and collaborative development workflows
Preferred Qualifications
Experience with machine learning frameworks (TensorFlow, PyTorch)
Understanding of ML deployment patterns and model serving
Familiarity with MLOps practices and tools
Experience with automated testing and quality assurance
Knowledge of security best practices in cloud environments
Soft Skills
Strong problem-solving and analytical abilities
Excellent written and verbal communication
Ability to work independently and in a team environment
Strong documentation skills
Proactive approach to identifying and solving problems
Ability to learn and adapt to new technologies quickly