Software Engineer, Data Infrastructure

Data Engineer · Senior · Full Time

San Francisco · USD 185k – 385k · 25mo ago

Role

What you'll do.

OpenAI is seeking a Software Engineer for Data Infrastructure to build and operate the large-scale data platforms that power its AI products and research. You'll work with massive Spark compute fleets, streaming systems, and exabyte-scale architecture while ensuring reliable, secure data access for machine learning and analytics workflows.

Responsibilities

Data Infrastructure Design: Design, build, and maintain distributed compute, data orchestration, and streaming infrastructure systems
Scalability Engineering: Ensure the data platform can scale by orders of magnitude while maintaining reliability and efficiency
Developer Tooling: Accelerate company productivity by building excellent data tooling and systems for engineering teams
Cross-functional Collaboration: Work with product, research and analytics teams to build technical foundations for new features
Production Operations: Own reliability of systems including participation in on-call rotation for critical incidents
Full Lifecycle Ownership: Take responsibility for architecture, implementation, production operations, and monitoring

Qualifications

What we look for.

Technical

Data Infrastructure Experience
4+ years in data infrastructure engineering or infrastructure engineering with strong data interest
Big Data Platforms
Experience supporting Spark, Kafka, Flink, Airflow, Trino, or Iceberg as platforms
Infrastructure Tooling
Proficiency with infrastructure tools like Terraform for cloud resource management
Distributed Systems
Experience debugging and operating large-scale distributed systems in production
System Reliability
Track record of building and operating scalable, reliable, and secure systems

Education

Bachelor's Degree
Bachelor's degree in Computer Science, Engineering, or related technical field preferred

Experience

Production Experience
4+ years of hands-on experience with production data infrastructure systems
AI/ML Context
Experience with data infrastructure in machine learning or AI environments preferred
On-call Experience
Experience participating in on-call rotations for critical production systems

Skills

Required

Apache Spark
Hands-on experience with Spark for large-scale data processing and compute fleet management
Streaming Systems
Proficiency with Kafka, Flink, or similar high-throughput streaming platforms
Infrastructure as Code
Strong experience with Terraform and cloud infrastructure automation
Distributed Systems
Deep understanding of distributed system design, debugging, and operations
Data Storage Systems
Experience with Iceberg, Delta Lake, or similar data lake technologies

Preferred

Workflow Orchestration
Experience with Apache Airflow for data pipeline orchestration
Query Engines
Familiarity with Trino, Presto, or similar distributed SQL query engines
ML Feature Engineering
Knowledge of ML feature stores and tools like Chronon
Cloud Platforms
Experience with AWS, GCP, or Azure for large-scale data infrastructure
Monitoring & Observability
Experience with monitoring tools for distributed data systems

Tech stack

Languages

Python · Java · Scala

Frameworks

Apache Spark · Apache Flink · Chronon

Databases

Apache Iceberg · Delta Lake · Apache Kafka

Tools

Apache Airflow · Terraform · Trino

Other

Kubernetes · Docker · AWS/GCP

Compensation

Pay and benefits.

Base · USD 185,000 – 385,000

Equity · Stock options

Benefits

Equity Package
Comprehensive equity compensation as part of total compensation package
Relocation Assistance
Full relocation assistance provided to new employees moving to San Francisco
Hybrid Work Model
Flexible hybrid work arrangement with 3 days per week in San Francisco office
Health Benefits
Comprehensive health, dental, and vision insurance coverage
Professional Development
Opportunities to work on cutting-edge AI infrastructure and learn from industry leaders

Process

Interview steps.

01
Initial Screening
Phone or video call with recruiting team to discuss background and interest
02
Technical Phone Screen
45-60 minute technical interview focusing on data infrastructure concepts and system design
03
Technical Deep Dive
Detailed technical interview covering distributed systems, data processing frameworks, and architecture design
04
System Design Interview
Design a large-scale data infrastructure system similar to OpenAI's data platform requirements
05
Onsite Interviews
Full day of interviews including technical, behavioral, and team fit assessments in San Francisco office
06
Final Interview
Discussion with senior leadership about role expectations, career goals, and cultural alignment

Full posting

Original listing.

About the Team

Data Platform at OpenAI owns the foundational data stack powering critical product, research, and analytics workflows. We operate some of the largest Spark compute fleets in production; design and build data lakes and metadata systems on Iceberg and Delta with a vision toward exabyte-scale architecture; run high-throughput streaming platforms on Kafka and Flink; provide orchestration with Airflow; and support ML feature engineering tooling such as Chronon. Our mission is to deliver reliable, secure, and efficient data access at scale and accelerate intelligent, AI-assisted data workflows.

Join us to build and operate these core platforms that underpin OpenAI products, research, and analytics.

We’re not just scaling infrastructure – we’re redefining how people interact with data. Our vision includes intelligent interfaces and AI-assisted workflows that make working with data faster, more reliable, and more intuitive.

About the Role

This role focuses on building and operating data infrastructure that supports massive compute fleets and storage systems, designed for high performance and scalability. You’ll help design, build, and operate the next generation of data infrastructure at OpenAI. You will scale and harden big data compute and storage platforms, build and support high-throughput streaming systems, build and operate low-latency data ingestion, enable secure and governed data access for ML and analytics, and design for reliability and performance at extreme scale.

You will take full lifecycle ownership: architecture, implementation, production operations, and on-call participation.

You’ve supported Spark, Kafka, Flink, Airflow, Trino, or Iceberg as platforms. You’re well-versed in infrastructure tooling like Terraform, experienced in debugging large-scale distributed systems, and excited about solving data infrastructure problems in the AI space.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, streaming infrastructure, and machine learning infrastructure, while ensuring scalability, reliability, and security
Ensure our data platform can scale by orders of magnitude while remaining reliable and efficient
Accelerate company productivity by empowering your fellow engineers & teammates with excellent data tooling and systems
Collaborate with product, research, and analytics teams to build the technical foundations and capabilities that unlock new features and experiences
Own the reliability of the systems you build, including participation in an on-call rotation for critical incidents

You might thrive in this role if you:

Have 4+ years in data infrastructure engineering OR
Have 4+ years in infrastructure engineering with a strong interest in data
Take pride in building and operating scalable, reliable, secure systems
Are comfortable with ambiguity and rapid change
Have an intrinsic desire to learn and fill in missing skills, and an equally strong talent for sharing learnings clearly and concisely with others

This role is exclusively based in our San Francisco HQ. We offer relocation assistance to new employees.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
