Principal Software Engineer - Data Engineering & Streaming Primitives

Snowflake - 5 days ago

Location

US-WA-Bellevue

Type

Full Time

Salary

USD 264,000 – 379,500

Level

Principal

Role

Principal Engineer

Posted

Jun 24, 2026

The role

Summary

Principal Software Engineer leading the design and architecture of Snowflake's core data engineering primitives including streaming and transformation systems. This role requires deep expertise in distributed systems, stream processing, and declarative query execution at petabyte scale, combined with proven technical leadership to drive multi-quarter initiatives across cross-functional teams and influence the data engineering ecosystem.

What you'll do

Define Technical Direction for Data Engineering Primitives: Establish and drive the strategic technical vision for Snowflake's core streaming and transformation primitives including Streams, Tasks, Dynamic Tables, and adjacent pipeline constructs. This involves setting architectural direction, making trade-off decisions between performance, scalability, and correctness that impact millions of customer workloads, and ensuring these primitives compose cleanly across the broader data engineering stack.

Lead Multi-Quarter Technical Initiatives: Identify, prioritize, and lead major technical investments spanning performance optimization, horizontal and vertical scalability, correctness guarantees, and reliability improvements. Transform ambiguous problem spaces into concrete engineering plans with measurable outcomes, timelines, and success metrics. Own execution from the architecture phase through production deployment at petabyte scale.

Partner on Cross-Functional Primitive Design: Collaborate with product management, research teams, and peer engineering organizations to co-design data engineering primitives that are both powerful for customers and elegant in their implementation. Participate in design reviews, validate assumptions against real-world customer usage patterns, and ensure API design decisions enable future extensibility.

Operate as an Architectural Force Multiplier: Set and maintain the technical bar for the organization by running rigorous architectural reviews, providing high-quality feedback on design documents, and maintaining architectural decision records. Mentor and sponsor emerging engineering talent through direct coaching, enabling engineers at all levels to grow their distributed systems expertise and make better technical decisions.

Gather Customer Signal and Real-World Validation: Work directly with customers, field teams, and internal stakeholders to understand real-world usage patterns, failure modes, and emerging use cases for data pipelines. Use this signal to inform prioritization, validate architectural assumptions, and identify opportunities for improvement or new primitives that solve actual customer pain points.

Build Technical Reputation and Thought Leadership: Contribute to Snowflake's standing in the data engineering community through internal design influence, external conference talks, peer-reviewed research publications, or contributions to open standards. Represent Snowflake's technical expertise in streaming systems, distributed data processing, and cloud-scale data infrastructure.

What we look for

Technical

Distributed Systems Design and Implementation

Expert-level knowledge of distributed systems concepts including consistency models (strong, eventual, causal), fault tolerance mechanisms, consensus algorithms, and correctness proofs. Demonstrated ability to reason about system behavior under failure conditions, network partitions, and high concurrency scenarios with thousands of concurrent workloads.

Stream Processing Architecture

Deep expertise in building or operating stream processing systems, including windowing semantics, state management, exactly-once or at-least-once guarantees, backpressure handling, and optimization of end-to-end latency. Understanding of both micro-batching and true streaming execution models.

Declarative Query Execution and Optimization

Strong background in designing declarative query languages, query optimization techniques, execution planning, and code generation. Experience optimizing query plans for throughput and resource utilization at cloud scale, including distributed execution across multiple nodes.

Systems-Level Performance Engineering

Proficiency in analyzing and optimizing systems for latency, throughput, and resource efficiency in cloud environments. Experience with profiling tools, identifying bottlenecks, and implementing optimizations that compound across millions of concurrent workloads.

C++ or Java Systems Programming

Production-level proficiency in either C++ (preferred for performance-critical systems) or Java for building reliable, maintainable distributed systems. Understanding of memory management, concurrency primitives, and performance characteristics relevant to your language choice.

Cloud-Scale Distributed Data Systems

Hands-on experience designing, building, or operating large-scale data infrastructure handling petabyte-scale workloads. Understanding of cloud provider primitives, resource management, cost optimization, and operational patterns for massively distributed systems.

Education

Bachelor's Degree in Computer Science or Related Field

Strong foundational computer science education with coverage of algorithms, data structures, and systems concepts. Equivalent professional experience demonstrating mastery of computer science fundamentals is acceptable.

Advanced Degree (Preferred)

Master's or PhD in Computer Science with emphasis on databases, distributed systems, or data management. While not required, an advanced degree often indicates a deeper theoretical foundation in the relevant domains. A publication record in peer-reviewed venues is a strong signal of expertise.

Experience

15+ Years Building Large-Scale Distributed Data Systems

Extensive career experience designing and building distributed data infrastructure, with at least 15 years focused on systems that handle massive scale. This experience should span multiple phases of a system's lifecycle: architectural design, implementation, scaling to production, operational hardening, and evolution based on customer feedback.

Multi-Quarter Technical Leadership and Architecture

Proven ability to lead significant technical initiatives from blank-page architecture through production deployment. Track record of successfully navigating complex technical trade-offs, coordinating across multiple engineering teams, and shipping systems that handle thousands of concurrent workloads at scale.

Production Systems at Petabyte Scale

Direct experience operating systems handling petabyte-scale data volumes and thousands of concurrent customer workloads. Understanding of production operational challenges including monitoring, alerting, incident response, and capacity planning at this scale.

Cross-Functional Technical Collaboration

Demonstrated ability to work effectively with product management, research teams, field engineers, and customers to understand requirements, validate assumptions, and prioritize initiatives. Track record of influencing technical direction through persuasive communication and architectural excellence.

Skills

Required skills

Distributed Systems Architecture

Design and implementation of systems that operate reliably across multiple machines, handling failures, consistency guarantees, and coordination challenges. Understanding of distributed consensus, replication strategies, and failure recovery patterns.

Stream Processing Systems

In-depth knowledge of stream processing architectures, windowing functions, state management, exactly-once semantics, and optimization techniques for real-time data pipelines.

C++ or Java

Production proficiency in either C++ (for systems-level performance work) or Java (for large-scale distributed systems). Ability to write maintainable, performant code in your chosen language at scale.

Database Systems Knowledge

Understanding of database internals including query optimization, execution engines, storage engines, indexing strategies, and transaction semantics. Knowledge of both OLTP and OLAP/analytical database patterns.

Technical Leadership and Architecture

Ability to design complex systems, communicate architectural trade-offs clearly, drive consensus among peers, and lead technical initiatives from conception through production. Strong written and verbal communication for explaining distributed systems concepts to technical and non-technical audiences.

Fault Tolerance and Consistency Reasoning

Deep understanding of different consistency models (strong, eventual, causal), failure scenarios in distributed systems, and design patterns for building fault-tolerant systems. Ability to reason formally about correctness under failures.

Nice to have

Apache Flink Experience

Hands-on experience with Apache Flink for building stream processing applications, understanding its execution model, checkpointing mechanisms, and state backend implementations.

Apache Kafka Expertise

Deep familiarity with Apache Kafka including broker architecture, consumer group semantics, replication and durability, and patterns for building event-driven systems with Kafka.

Spark Structured Streaming

Experience building real-time applications using Spark Structured Streaming, including understanding of micro-batching execution, state management, and integration with Delta Lake or other storage formats.

Analytical Database Systems (DBMS)

Production experience with major analytical platforms like Snowflake, Google BigQuery, Amazon Redshift, Databricks, or Teradata. Understanding of their execution engines, optimization strategies, and scale characteristics.

Data Engineering Ecosystem Tools

Practical knowledge of modern data engineering tools including dbt for transformation, Apache Airflow for orchestration, Fivetran for data integration, Apache Iceberg or Delta Lake for table formats, and CDC (Change Data Capture) systems.

Change Data Capture (CDC) Systems

Experience designing or implementing CDC solutions, understanding of log-based CDC versus query-based approaches, and patterns for handling incremental computation and data propagation across systems.

Research Publications or External Thought Leadership

Publication record in peer-reviewed database or distributed systems venues, conference talk experience, or recognized contributions to the data engineering or streaming systems community.

Compensation & benefits

Salary

USD 264,000 – 379,500 (annual)

Stock options

Available

Benefits

Comprehensive Health Coverage

Medical, dental, and vision insurance with options to tailor coverage to individual needs. Coverage extends to dependents and family members.

Retirement Plans

401(k) retirement savings plan with company matching contributions, enabling long-term wealth building for retirement.

Equity Compensation

Stock options or restricted stock units (RSUs) as part of the total compensation package, enabling you to participate in Snowflake's long-term growth and success.

Flexible Time Off

Unlimited or generous paid time off policy allowing you to maintain work-life balance and take time for personal needs, travel, and family.

Professional Development

Learning and development budgets to support continuous skill development, conference attendance, certification programs, and technical training in distributed systems, cloud platforms, or specialized tools.

Remote Work Flexibility

Flexible work arrangements with options for remote work, allowing you to balance office collaboration with focused work from home.

Mental Health and Wellness

Mental health support including counseling services, wellness programs, fitness benefits, and resources for maintaining overall well-being.

Parental Leave

Paid parental leave for childbirth or adoption, supporting family building and work-life balance.
