OpenAI

Software Engineer, Delivery / CD

Location: San Francisco
Type: Full Time
Salary: USD 210,000 – 490,000
Level: Senior
Role: DevOps Engineer
Posted: Mar 10, 2026

The role

Summary

OpenAI is seeking a Software Engineer for its Delivery/CD team to build and operate continuous deployment systems that safely ship infrastructure and product code to production. The role focuses on building deployment platforms, release pipelines, and rollout safety mechanisms that enable rapid, low-risk deployments across dozens of Kubernetes clusters and global regions.

What you'll do

Continuous Deployment Infrastructure: Design and build CD systems that safely deploy changes across dozens of Kubernetes clusters and global regions
Progressive Delivery Systems: Develop canary releases, staged rollouts, and automated rollback mechanisms for safe production deployments
Engineering Velocity Optimization: Reduce friction in release pipelines and automate manual operational workflows to improve development speed
Cross-Team Collaboration: Partner with product and infrastructure teams to ensure services are deployable, observable, and resilient at scale
Deployment Methodology Evolution: Implement and advance GitOps, infrastructure-as-code, and progressive delivery patterns across the organization
Automated Health Monitoring: Build systems that evaluate deployment health using metrics, logs, traces, and alerts to detect regressions and trigger rollbacks
AI-Assisted Deployment Workflows: Develop systems supporting agent-assisted or autonomous deployment processes using modern AI tooling
Platform Reliability: Maintain high availability and performance of deployment infrastructure serving hundreds of engineers
Security Integration: Implement security scanning, compliance checks, and vulnerability assessments in deployment pipelines
Documentation and Training: Create comprehensive documentation and training materials for deployment platform adoption across engineering teams

What we look for

Technical

Kubernetes Expertise: 5+ years experience with Kubernetes-based deployment systems at enterprise scale
Continuous Deployment Platforms: Proven experience building or operating CD platforms serving multiple development teams
GitOps Proficiency: Deep familiarity with GitOps tooling such as ArgoCD, Flux, or similar systems
Infrastructure as Code: Expert-level experience with Terraform, Ansible, or similar IaC tools for cloud infrastructure management
Programming Languages: Strong proficiency in Python, Go, or similar languages for building automation and platform tools
Cloud Platforms: Extensive experience with AWS, GCP, or Azure for large-scale infrastructure deployment
Monitoring and Observability: Experience implementing comprehensive monitoring, logging, and alerting systems for production deployments
Container Technologies: Deep understanding of Docker, container orchestration, and cloud-native architecture patterns

Education

Degree Requirement: Bachelor's degree in Computer Science, Engineering, or equivalent practical experience preferred

Experience

Platform Engineering Experience: 5-8 years in platform engineering, DevOps, or site reliability engineering roles
Large-Scale Systems: Experience operating deployment systems serving 100+ engineers and thousands of deployments daily
Production Operations: Strong background in production incident response, rollback procedures, and operational safety
Cross-Functional Leadership: Experience working with diverse engineering teams to improve deployment practices and developer productivity

Skills

Required skills

Kubernetes: Expert-level experience with Kubernetes cluster management, networking, and large-scale deployments
Python: Advanced Python programming skills for building automation tools and platform services
GitOps: Deep understanding of GitOps principles and hands-on experience with ArgoCD or Flux
Infrastructure as Code: Proficiency with Terraform for managing cloud infrastructure and Kubernetes resources
CI/CD Pipelines: Experience designing and implementing complex continuous integration and deployment pipelines
Monitoring and Alerting: Skills in implementing comprehensive observability using Prometheus, Grafana, and similar tools
Container Orchestration: Advanced knowledge of Docker, container lifecycle management, and orchestration patterns
Cloud Platforms: Hands-on experience with AWS, GCP, or Azure for production-grade infrastructure

Nice to have

Go Programming: Experience with Go for building high-performance infrastructure tools and Kubernetes operators
Service Mesh: Knowledge of Istio, Linkerd, or similar service mesh technologies for advanced traffic management
Helm Charts: Experience creating and maintaining Helm charts for complex application deployments
Machine Learning Operations: Understanding of MLOps practices and deploying machine learning models at scale
Security Tooling: Experience with security scanning tools, vulnerability management, and compliance automation
Incident Response: Background in production incident management, post-mortem processes, and reliability engineering
AI/ML Infrastructure: Interest or experience in AI model deployment, GPU cluster management, and ML infrastructure
Open Source Contributions: Active contributions to CNCF projects, Kubernetes ecosystem, or other relevant open source projects

Compensation & benefits

Salary

USD 210,000 – 490,000 (annual)

Stock options

Available

Benefits

Equity Compensation

Significant equity package in one of the world's leading AI companies with strong growth potential

Health Insurance

Comprehensive medical, dental, and vision insurance coverage for employees and families

Mental Health Support

Access to mental health resources, counseling services, and wellness programs

Professional Development

Conference attendance, training budget, and opportunities to work with cutting-edge AI technologies

Flexible Work Arrangements

Hybrid work options with modern office facilities in San Francisco

Retirement Benefits

401(k) plan with company matching and comprehensive retirement planning resources

Parental Leave

Generous parental leave policies supporting new parents with extended time off

Technology Stipend

Equipment and technology allowances for optimal home office setup

Learning Budget

Annual budget for books, courses, certifications, and skill development in AI and engineering

Commuter Benefits

Transportation subsidies and parking allowances for San Francisco office


Interview process

  1. Initial Screening: 30-minute phone call with recruiter covering background, motivation, and basic technical fit for the role
  2. Technical Phone Screen: 45-minute technical interview focusing on Kubernetes, CI/CD concepts, and system design fundamentals
  3. System Design Interview: 60-minute deep-dive into designing a large-scale continuous deployment system with focus on safety and scalability
  4. Technical Deep Dive: 90-minute hands-on session covering GitOps implementation, infrastructure as code, and deployment pipeline design
  5. Behavioral and Culture Fit: 45-minute interview focusing on collaboration, problem-solving approach, and alignment with OpenAI's mission
  6. Final Round Panel: Series of conversations with team members and leadership covering technical expertise and team integration
  7. Reference Checks: Verification of previous work experience and technical contributions with former colleagues and managers
