Tesla
Website:
tesla.com
Job details:
About The Team
Engineering Tools owns and operates the on-prem developer platforms that every Tesla engineer depends on every day: GitHub Enterprise, JFrog Artifactory, GitHub Copilot (self-hosted), Cursor (on-prem), and the Atlassian suite (Jira Service Management + Confluence). We also run the AI-augmented support layer that fronts these platforms: a Mattermost support bot backed by our internal Nabu RAG platform, observability via OpenTelemetry, and a GitOps-driven Kubernetes deployment footprint in our cluster.
If one of our systems is down, thousands of Tesla engineers stop shipping. We're hiring a Staff SRE to own the reliability, scalability, and operational maturity of that footprint.
Key Responsibilities
Platform administration: Manage GitHub Enterprise (Cloud and/or Server) organizations, teams, repos, branch protection rules, Actions runners, and Apps. Administer JFrog Artifactory repositories (local, remote, virtual), permissions, replication, and storage policies.
User support: Triage and resolve tickets covering access requests, repo migrations, build/artifact failures, authentication issues, and integrations. Define and meet SLAs.
Migrations & onboarding: Lead repo migrations into/out of GitHub (e.g., GitHub Migrations API, gh-migration tooling) and Artifactory repository imports/exports. Onboard new teams with templates and standards.
Automation: Build scripts and tooling (Bash, Python, Terraform, GitHub Actions, JFrog CLI) to automate provisioning, permission audits, cleanup, and reporting. Eliminate repetitive support work.
Reliability & monitoring: Monitor platform health, storage usage, runner capacity, and license consumption. Coordinate upgrades, patches, and incident response with the vendor.
Security & compliance: Enforce SSO/SAML, SCIM provisioning, secret scanning, signed commits, audit logging, and least-privilege access. Support SOC 2 / ISO audits.
Integrations: Maintain integrations with CI/CD (Jenkins, GitHub Actions, GitLab CI), SAST/SCA scanners, Jira, Slack, and internal developer portals.
Documentation & enablement: Write runbooks, FAQs, and self-service guides. Host office hours and training sessions for developers.
Required Qualifications
3+ years administering GitHub Enterprise (Cloud or Server) at scale (500+ users or 1000+ repos).
2+ years administering JFrog Artifactory (or comparable: Nexus, Cloudsmith, Harbor).
Strong scripting in Bash and Python; comfortable with REST APIs and curl/jq.
Working knowledge of Git internals (refs, packfiles, LFS, submodules) and ability to debug repo corruption, large-file issues, and merge problems.
Hands-on experience with at least one CI/CD system (GitHub Actions, Jenkins, GitLab CI, CircleCI).
Familiarity with SSO/SAML, SCIM, OIDC, and personal/fine-grained access tokens.
Excellent written communication - you can turn a confusing incident into a clear postmortem and a vague ticket into a fixable problem.
Preferred Qualifications
Experience with GitHub Migrations API, gh-migration-tool, or gei (GitHub Enterprise Importer).
Experience operating Artifactory in HA mode, with S3/blob storage, and Xray for vulnerability scanning.
Infrastructure-as-Code: Terraform providers for GitHub and Artifactory.
Container/package format expertise: Docker, npm, Maven, PyPI, Helm, Conan.
Familiarity with secret scanning tools (GitHub Advanced Security, GitGuardian, TruffleHog) and dependency management (Dependabot, Renovate).
Prior on-call or production support experience.
Exposure to GHAS, Copilot for Business, or Copilot Enterprise rollouts.
Bonus
Experience operating self-hosted LLM inference (Copilot Enterprise, on-prem Cursor backend, vLLM, or similar), RAG pipelines, or vector databases.
Soft Skills
Excellent written communication - you can write a post-mortem that engineering leadership reads to the end, and a runbook that a junior on-call can execute at 3 AM.
Strong technical influence without authority; you raise the reliability bar across teams by example and through reviews, not by mandate.
Calm under pressure during sev-1 incidents affecting thousands of engineers.
Education
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent professional experience.
Why This Role Is Different
Customer = every Tesla engineer. Your platforms unblock Vehicle Software, Autopilot, Energy, and Manufacturing teams. The impact of every reliability improvement compounds across the company.
On-prem by design. We don't outsource our critical paths to SaaS. You'll own the full stack - hardware, network, OS, platform, application, observability - and you'll have the authority to change it.
AI-augmented support. We're not just operating platforms; we're building the AI tooling (Nabu RAG + Mattermost support bot + Copilot/Cursor integrations) that lets a small SRE team serve a very large engineering org. You'll help shape that.
High autonomy, high ownership. Engineering Tools is small and senior-heavy. As a Staff SRE you'll set technical direction for multiple platforms - not just execute someone else's roadmap.