Cursor

Software Engineer, ML Infrastructure

Cursor · 1 month ago
Location

SF / NY

Type

Full Time

Salary

USD 160,000 – 250,000

Level

Senior

Role

ML Infrastructure Engineer

Posted

Jan 27, 2026


The role

Summary

Cursor is seeking a Software Engineer to join its ML Infrastructure team and work on the large-scale compute, storage, and software infrastructure that supports the development of AI-powered coding models. The role involves collaborating with ML researchers to build high-performance GPU clusters, improve training throughput, and develop systems for workload scheduling and data movement. The position is available in both the SF and NY offices.

What you'll do

ML Training Infrastructure: Collaborate with ML researchers to improve throughput and reliability of large-scale model training
GPU Cluster Management: Work with OEMs and cloud providers to plan and build cutting-edge GPU infrastructure
Compute Scalability: Improve density and scalability of compute environments for increasingly large RL workloads
Infrastructure Automation: Create software and systems to automate building, monitoring, and running GPU clusters
Workload Orchestration: Build workload scheduling and data movement systems to support a growing training footprint
System Reliability: Ensure high availability and performance of distributed ML infrastructure
Cross-functional Collaboration: Work closely with ML engineers and researchers to enable their work through infrastructure improvements

What we look for

Technical

Systems Programming: Strong background in systems and infrastructure-focused software engineering
Programming Languages: Proficiency in Python, TypeScript, Rust, and Go for infrastructure development
Distributed Systems: Experience with distributed storage and networking infrastructure
Linux Administration: Deep knowledge of Linux systems across cloud and bare metal environments
Large-scale Infrastructure: Exposure to systems managing thousands of nodes with significant resource footprints
Infrastructure as Code: Production experience with IaC and configuration management across hosts and Kubernetes

Education

Computer Science Degree: Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field
Systems Engineering Background: Strong foundation in computer systems, networking, and distributed computing

Experience

Infrastructure Engineering: 5+ years of experience in large-scale infrastructure development
Distributed Systems: Proven experience building and maintaining distributed computing systems
ML Infrastructure: Experience supporting machine learning workloads and training pipelines
Cloud and Bare Metal: Experience with both cloud platforms and bare metal infrastructure management

Skills

Required skills

Python: Advanced proficiency for ML infrastructure and automation
TypeScript: Strong skills for developer tooling and web interfaces
Rust: Systems programming for high-performance infrastructure
Go: Distributed systems and cloud-native development
Linux Systems: Deep knowledge of Linux administration and networking
Distributed Storage: Experience with large-scale storage systems
Kubernetes: Container orchestration and cluster management
Infrastructure as Code: Terraform, Ansible, or similar tools for automation

Nice to have

NVIDIA GPU Operations: Experience with Blackwell and Hopper-class GPU hardware
InfiniBand/RoCE: High-performance networking for GPU clusters
Ray Framework: Distributed computing for ML workloads
SLURM: Workload management and job scheduling
ML Training Pipelines: Experience optimizing machine learning training workflows
Bare Metal Infrastructure: Direct hardware management and optimization

Compensation & benefits

Salary

USD 160,000 – 250,000 (annual)

Stock options

Available

Benefits

Equity Package

Significant equity stake in a fast-growing AI company

Health Insurance

Comprehensive medical, dental, and vision coverage

In-Person Culture

Cozy offices in North Beach SF and Manhattan with well-stocked libraries

Learning Resources

Access to cutting-edge research and development in AI/ML

Flat Organization

Direct impact and minimal bureaucracy in a talent-dense team

Professional Development

Opportunity to work with state-of-the-art ML infrastructure


Interview process

  1. Initial Screen: Phone/video call with hiring manager to discuss background and role alignment
  2. Technical Interview: Systems design and infrastructure architecture discussion with engineering team
  3. Coding Assessment: Live coding session focused on systems programming and distributed systems
  4. Cultural Fit: Conversation about values, working style, and team collaboration
  5. Final Interview: Leadership interview covering career goals and company vision alignment
  6. Reference Check: Verification of experience and performance with previous employers



Cursor

Built to make you extraordinarily productive, Cursor is the best way to build software with AI.

San Francisco, California, United States · Founded 2021 · cursor.com

Tech Stack

Languages: Python, TypeScript, Rust, Go
Frameworks: Ray, Kubernetes
Databases: Distributed Storage Systems
Tools: SLURM, Infrastructure-as-Code, Configuration Management, NVIDIA GPUs, InfiniBand/RoCE
Other: Linux Systems, Cloud Platforms, Bare Metal Infrastructure