THE ROLE
The Digital Experience team at Tesla is at the forefront of building a consistent, global customer experience across all digital touchpoints – web, native apps, and in-car infotainment systems. This team is building the next-generation products and supporting infrastructure that allow our customers, living in 60+ countries, to understand, interact with, and purchase our amazing vehicles and energy products.
As a DevOps/SRE on our team, you'll be responsible for automating deployment, configuration, and management, and for ensuring our integrated systems work as expected by implementing proper monitoring tools and techniques that detect issues before they become production problems.
Responsibilities
- Design and develop ERP and customer engagement software tools and applications.
- Integrate software with various types of equipment, systems, and people.
- Assist Software, Controls, Manufacturing, and other engineering teams with onboarding applications.
- Support vendor applications, automate configuration and deployment of services.
- Ensure best practices and observability for Ignition, such as metrics, logging, tracing, and alerting.
- Monitor performance and provide recommendations for improvement; continuous improvement is key.
- Author technical documentation for workflows/processes/best practices.
Requirements
Required Qualifications
- 7+ years as a Software Developer, DevOps Engineer, or Site Reliability Engineer (SRE).
- 5+ years of experience in a high-level language (Golang).
- Willing to take on C#/.NET and become a professional C#/.NET developer.
- Working knowledge of CI/CD pipelines (GitHub Enterprise).
- Working knowledge of Docker and Kubernetes.
- Working knowledge of the Linux OS.
- Working knowledge of networking (TCP/IP and application layer).
Preferred Qualifications
- Working knowledge of RESTful APIs, OAuth2, JWT, and encryption/decryption.
- Working knowledge of secrets management (HashiCorp Vault, etc.).
- Working knowledge of MySQL, Redis, Kafka, RabbitMQ, and Elasticsearch.
- Understand and apply the concepts of observability and Infrastructure as Code (Prometheus, Alertmanager, Grafana, Splunk, InfluxDB, …).
- Willing to mentor other team members and engineers with less SRE/IT experience.
- Comfortable doing live troubleshooting of issues on NOC bridges/outage calls.