Snowflake

Senior Production Engineer

Snowflake, 1 month ago
Location

US-CA-Menlo Park

Type

Full Time

Salary

USD 200,000 – 287,500

Level

Senior

Role

Site Reliability Engineer

Posted

Feb 10, 2026


The role

Summary

Snowflake is seeking a Senior Production Engineer to join their Production Engineering Team, responsible for driving reliability tools and processes that ensure top-tier customer experiences. This role involves end-to-end production reliability management, from proactive issue prevention to rapid detection and efficient resolution of problems in large distributed systems.

What you'll do

Service Lifecycle Management: Engage in and improve the whole lifecycle of services from inception and design through deployment, operation, and refinement
System Scaling and Automation: Scale systems sustainably through automation and drive changes that improve reliability and velocity
Incident Response Leadership: Establish and practice low noise incident response rotations and conduct blameless postmortems to prevent problem recurrence
Code Development and Review: Write and review code, develop documentation and capacity plans, and debug complex problems on large distributed systems
SLO Collaboration: Collaborate with software engineers to establish, maintain, and optimize functional and performance Service Level Objectives
On-call Operations: Participate in a 12x7 on-call rotation to ensure continuous system reliability and rapid incident response

What we look for

Technical

Programming Proficiency: Proficient in at least one modern programming language for system development and automation
Problem-Solving Methodology: Systematic problem-solving methods and approaches to complex technical challenges
Distributed Systems Experience: Experience debugging and optimizing performance in large distributed systems environments

Education

Bachelor's Degree: Bachelor's degree in Computer Science or a related technical field involving software engineering, or equivalent practical experience

Experience

Production Systems: Experience with production-level distributed systems and reliability engineering practices
Cloud Infrastructure: Hands-on experience with public cloud providers (AWS, Azure, or GCP) for scalable infrastructure deployment
Container Orchestration: Experience with containers and container orchestration systems such as Kubernetes
Linux Infrastructure: Experience deploying, managing, and operating scalable and fault-tolerant Linux infrastructure

Skills

Required skills

Modern Programming Languages: Proficiency in at least one modern programming language such as Python, Go, Java, or similar
Systematic Problem-Solving: Strong analytical and systematic approach to troubleshooting complex distributed systems
Communication Skills: Effective written and verbal communication for cross-team collaboration and documentation

Nice to have

Load Testing Experience: Experience with capacity and load testing of distributed applications and systems
Kubernetes and Containers: Hands-on experience with containers and container orchestration systems like Kubernetes
SLO-Driven Processes: Experience with Service Level Objective-driven reliability management processes
Public Cloud Platforms: Practical experience with AWS, Azure, or Google Cloud Platform for infrastructure management
Linux Infrastructure: Experience in deploying, managing, and operating scalable Linux-based infrastructure
Independent Work: Ability to prioritize tasks effectively and work independently in fast-paced environments

Compensation & benefits

Salary

USD 200,000 – 287,500 (annual)

Stock options

Available

Benefits

Comprehensive Health Benefits

Full medical, dental, and vision coverage for employees and eligible dependents

Equity Participation

Stock options and equity participation in company growth and success

Professional Development

Opportunities for continuous learning, skill development, and career advancement in cloud computing

Work-Life Balance

Flexible working arrangements and time-off policies to support personal well-being

Innovation Culture

Culture focused on impact, innovation, and collaboration with cutting-edge technology


Interview process

  1. Initial Screening: Phone or video screening with a recruiter to discuss background, experience, and interest in production engineering
  2. Technical Phone Interview: Technical discussion covering system design, debugging scenarios, and programming concepts relevant to distributed systems
  3. System Design Interview: Deep dive into designing scalable, reliable systems with a focus on SLOs, monitoring, and incident response
  4. Coding Interview: Programming exercise focusing on automation, problem-solving, and code quality for production environments
  5. Final Round Interview: Panel interview with team members covering cultural fit, scenario-based questions, and discussion of production engineering practices



Snowflake


Snowflake is an American cloud computing company offering data warehousing and analytics platforms.

Bozeman, Montana, United States
Founded 2012
snowflake.com

Tech Stack

Languages
Modern Programming Languages, Python, Go, Java
Frameworks
Kubernetes, Docker
Databases
Snowflake Data Platform, Distributed Database Systems
Tools
Monitoring and Observability Tools, Load Testing Frameworks, Infrastructure as Code
Other
AWS, Azure, Google Cloud Platform, Linux
