Snowflake

Staff Software Engineer - FDB Platform

Location

US-CA-Menlo Park

Type

Full Time

Salary

USD 236,000 – 339,250

Level

Staff

Role

Staff Engineer

Posted

Jun 30, 2026

The role

Summary

Staff Software Engineer role at Snowflake focused on designing and implementing scalable distributed systems for the FDB (FoundationDB-based) platform that powers the Snowflake Data Cloud across AWS, Azure, and GCP. This position requires 8+ years of infrastructure experience with deep expertise in distributed systems, container orchestration, and large-scale database technologies to architect cloud-agnostic solutions for autoscaling, self-healing clusters, and cost-optimized operations.

What you'll do

Design Scalable Distributed System Solutions: Architect and design cloud-agnostic distributed system solutions for the FDB platform infrastructure, considering multi-cloud deployment across AWS, Azure, and GCP with a focus on elastic scalability, fault tolerance, and operational simplicity
Solve Fault-Tolerance and High Availability Challenges: Analyze complex fault-tolerance scenarios and high availability requirements across distributed infrastructure, and implement robust solutions that prevent single points of failure and ensure continuous service availability at scale
Address Performance and Scale Challenges: Identify performance bottlenecks and scalability limitations in the FDB platform infrastructure, conduct thorough analysis, implement optimizations, and validate improvements through testing and production monitoring
Own End-to-End Project Delivery: Take complete ownership of complex infrastructure projects from problem identification and solution design through implementation, comprehensive testing, performance validation, and safe production rollout with minimal operational risk
Build Consistency-Aware Solutions: Understand and apply nuanced trade-offs between consistency guarantees, durability requirements, and cost implications to build solutions that meet the demands of rapidly growing Snowflake services while optimizing operational expenses
Develop Next-Generation Transaction and Storage Systems: Build advanced transaction processing systems, intelligent caching layers, optimized storage engines, and multi-tenant isolation capabilities that power Snowflake's expanding product portfolio
Evangelize Database Best Practices: Share expertise across the organization regarding distributed database usage patterns, end-to-end system architecture principles, and operational excellence to elevate engineering practices company-wide
Instrument and Debug Production Systems: Implement comprehensive instrumentation and observability across FDB platform components, diagnose complex production issues through systematic analysis, and develop solutions that address root causes and improve system resilience
Drive Infrastructure Automation: Develop and enhance self-managing infrastructure capabilities including autoscaling based on utilization and traffic patterns, automatic cluster provisioning with zero manual intervention, and self-healing mechanisms that prevent or mitigate production impact
Optimize Operational Cost Efficiency: Design and implement self-optimizing systems that ensure FDB clusters run at optimal resource utilization, minimize cost-of-goods-sold (COGS), and maintain efficiency as workload patterns evolve

What we look for

Technical

Large-Scale Distributed Systems Design: Proven expertise designing, building, and operating production distributed systems infrastructure at scale, with demonstrated understanding of trade-offs between consistency, durability, availability, and cost
Container Orchestration Mastery: Solid understanding of Kubernetes, Mesos, OpenShift, or equivalent container platforms, including internal architecture, scheduling algorithms, networking models, and operational patterns
Systems Programming: Fluency in Java with strong foundation in multi-threading, concurrency primitives, memory management, and performance optimization for high-throughput distributed systems
Key-Value Store Expertise: Practical experience designing, deploying, or operating scalable key-value stores such as FoundationDB, RocksDB/LevelDB, DynamoDB, Redis, or Cassandra in production environments
Operating Systems Knowledge: Deep understanding of kernel concepts including multi-threading models, memory management strategies, networking stack internals, storage I/O optimization, and performance profiling
Cloud-Native Architecture: Experience building and managing distributed systems across multiple cloud providers (AWS, Azure, GCP) with understanding of cloud-agnostic design principles and multi-cloud orchestration

Education

Bachelor's Degree in Computer Science (Required): BS in Computer Science or equivalent practical experience demonstrating advanced systems knowledge
Advanced Degree (Preferred): Master's degree or PhD in Computer Science, distributed systems, or related technical field that demonstrates deepened expertise in algorithms, systems design, and theoretical foundations

Experience

8+ Years Infrastructure Engineering: Minimum eight years of industry experience designing, building, deploying, and supporting large-scale infrastructure systems in production environments with responsibility for system reliability and performance
Stateful Service Operations: Significant hands-on experience designing, implementing, and operating large-scale distributed systems infrastructure that manages stateful services with high availability requirements
Complex Project Delivery: Demonstrated track record of owning and delivering highly complex projects in the distributed systems space from conception through design, implementation, testing, and production rollout
Big Data and Storage Technologies: Professional experience with big data storage technologies, distributed file systems (HDFS), columnar databases, or related data infrastructure platforms

Skills

Required skills

Distributed Systems Architecture: Design and implementation of fault-tolerant, highly available distributed systems with expertise in consensus algorithms, replication strategies, and failure recovery mechanisms
Java Programming: Production-level systems programming in Java with deep expertise in concurrency, multi-threading, performance optimization, and low-latency system design
Kubernetes or Container Orchestration: Deep knowledge of container orchestration platforms for cluster management, resource scheduling, service discovery, and automated workload orchestration
Database Systems Knowledge: Understanding of distributed database internals, transaction processing, consistency models, durability mechanisms, and multi-tenancy architecture
Cloud Infrastructure Operations: Hands-on experience deploying and managing distributed systems on public cloud platforms including AWS, Azure, or GCP with understanding of cloud-native operational patterns
Performance Analysis and Optimization: Ability to identify bottlenecks, instrument systems for observability, analyze performance metrics, and implement optimizations for scale and efficiency

Nice to have

FoundationDB Experience: Prior hands-on experience with FoundationDB architecture or operational characteristics, or contributions to the FoundationDB ecosystem, is highly valued
Multi-Cloud Architecture: Expertise building and operating systems across multiple cloud providers with understanding of cloud-agnostic abstraction patterns and provider-agnostic infrastructure-as-code
Autoscaling and Self-Healing Systems: Design and implementation experience building self-managing infrastructure with auto-scaling capabilities, failure detection, and self-healing mechanisms
Big Data Technologies: Experience with HDFS, Cassandra, columnar databases, or other big data storage technologies and their distributed operation at scale
Go or C++ Systems Programming: Additional expertise in systems languages like Go or C++ for performance-critical infrastructure components
Infrastructure-as-Code: Proficiency with Terraform, CloudFormation, or similar infrastructure automation tools for declarative infrastructure management
Observability and Monitoring: Experience building comprehensive monitoring, logging, and distributed tracing systems for production distributed systems

Compensation & benefits

Salary

USD 236,000 – 339,250 (annual)

Stock options

Available

Benefits

Equity Compensation

Stock options that provide an ownership stake and participation in the long-term value of Snowflake's growth as a cloud computing company

Comprehensive Health Insurance

Medical, dental, and vision coverage for employees and eligible dependents

401(k) Retirement Plans

Tax-advantaged retirement savings plans with company match contributions

Professional Development

Learning and development budgets to support continued technical growth, conference attendance, and skill development in distributed systems and cloud technologies

Flexible Work Environment

Flexibility to work remotely or in the office, with support for work-life balance in a high-growth technology company

Collaborative Innovation Culture

Environment built on impact, innovation, and collaboration where technical excellence is valued and challenging problems drive career advancement


Interview process

  1. Initial Screening: Recruiter phone screen to assess background, experience with distributed systems, and alignment with Staff-level expectations for ownership and complexity handling
  2. Technical Phone Screen: Senior engineer conversation exploring distributed systems design thinking, specific experience with key-value stores or database infrastructure, and approach to solving complex architectural challenges
  3. System Design Interview: In-depth technical interview requiring you to design a large-scale distributed system component, articulate trade-offs (consistency vs. availability vs. cost), explain failure modes, and discuss operational considerations
  4. Deep Dive Technical Assessment: Discussion of past complex infrastructure project you owned, focusing on design decisions, challenges overcome, lessons learned, and how you would approach similar problems differently with current knowledge
  5. Infrastructure Knowledge Assessment: Technical evaluation of expertise in Kubernetes or similar container orchestration platforms, demonstrating understanding of scheduling algorithms, networking models, and operational patterns
  6. Leadership and Collaboration Discussion: Conversation exploring your approach to architecture evangelism, how you influence engineering practices, mentoring approach, and track record of elevating team technical capabilities
  7. Hiring Manager Interview: Discussion with engineering leadership regarding vision alignment, appetite for solving hard distributed systems problems, working style in fast-moving environments, and long-term technical growth aspirations

Snowflake

Snowflake is an American cloud computing company offering data warehousing and analytics platforms.

Bozeman, Montana, United States
Founded 2012
snowflake.com

Tech Stack

Languages
Java, Python, Go, C++
Frameworks
Kubernetes, Mesos, OpenShift
Databases
FoundationDB, RocksDB, DynamoDB, Cassandra, Redis
Tools
Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), Terraform, Docker
Other
Distributed Systems Design, Multi-threading and Concurrency, Operating Systems, Cloud Infrastructure Automation, HDFS and Big Data Technologies