Snowflake

Senior/Staff Software Engineer - Machine Learning Platform (Inference)

Location

US-CA-Menlo Park

Type

Full Time

Salary

USD 236,000 – 339,250

Level

Senior

Role

ML Engineer

Posted

Aug 21, 2025


The role

Summary

Snowflake is seeking a Senior/Staff Software Engineer to join its Machine Learning Platform team, focusing on building inference infrastructure for ML and LLM workloads. The role involves designing and implementing state-of-the-art ML serving systems, working with cutting-edge inference engines, and scaling AI capabilities for enterprise customers on Snowflake's cloud data platform.

What you'll do

ML Platform Architecture: Define and own the roadmap for the next-generation machine learning platform, collaborating with senior architects, PMs, and leadership
LLM Inference Systems: Design and implement state-of-the-art inference engines for serving large language models at enterprise scale
Technical Vision: Build and execute a strategic vision for incorporating new advances in machine learning to achieve business objectives
Operational Excellence: Ensure reliability, availability, and performance commitments are met for ML services serving enterprise customers
Cross-Team Collaboration: Partner with ML teams across Snowflake to improve development velocity and capabilities
Technical Leadership: Support team members in delivering high-quality technical solutions and architectural decisions
Innovation Delivery: Drive innovation in AI/ML capabilities for thousands of enterprise customers on the Snowflake platform

What we look for

Technical

ML Platform Experience: 7+ years designing, building, and supporting internet-serving infrastructure and machine learning platforms
LLM Inference Engines: Hands-on experience with vLLM, TensorRT-LLM, TEI, and SGLang, and an understanding of the tradeoffs between inference engines
Fine-tuned LLM Serving: Experience serving fine-tuned LLMs built with PEFT, DPO, and reinforcement learning techniques
ML Frameworks: Proficiency with scikit-learn, XGBoost, PyTorch, TensorFlow, and MLflow
ML Serving Systems: Experience building both batch and real-time ML serving systems for production environments

Education

Degree Requirement: BS/MS/PhD in Computer Science or a related technical field, or equivalent industry experience

Experience

Technical Leadership: Track record of building roadmaps and leading technical decision-making for machine learning teams
Platform Architecture: Experience with large-scale distributed systems and cloud infrastructure
Enterprise Systems: Background in building reliable, scalable systems for enterprise customers

Skills

Required skills

Machine Learning Platforms: 7+ years of experience designing and building ML infrastructure
LLM Inference: Expertise with modern inference engines such as vLLM, TensorRT-LLM, TEI, and SGLang
Distributed Systems: Strong background in scalable, reliable system architecture
Python/Java: Proficiency in the primary programming languages for ML platform development
Cloud Infrastructure: Experience with cloud platforms and containerization technologies

Nice to have

Fine-tuning Techniques: Experience with PEFT, DPO, and reinforcement learning for LLM optimization
MLOps: Experience with ML lifecycle management and deployment automation
Real-time Serving: Background in low-latency, high-throughput ML serving systems
Data Warehousing: Familiarity with data warehouse architectures and analytics platforms
Technical Leadership: Experience mentoring teams and driving technical strategy

Compensation & benefits

Salary

USD 236,000 – 339,250 (annual)

Stock options

Available

Benefits

Equity Compensation

Stock options and equity participation in Snowflake's growth

Health Insurance

Comprehensive medical, dental, and vision coverage

Professional Development

Learning and development opportunities in cutting-edge AI/ML technologies

Innovation Culture

Environment focused on impact, innovation, and collaboration

Career Growth

Opportunities to advance in a rapidly growing cloud computing company


Interview process

  1. Initial Screen: Technical recruiter discussion about background, experience, and role alignment
  2. Technical Phone Screen: 45-60 minute technical discussion covering ML systems design and inference optimization
  3. System Design Interview: Deep dive into designing scalable ML platform architecture and LLM inference systems
  4. Technical Deep Dive: Detailed technical discussion about specific inference engines, fine-tuning approaches, and performance optimization
  5. Behavioral Interview: Leadership scenarios, collaboration examples, and cultural fit assessment
  6. Final Round: Panel interviews with senior leadership and team members covering technical vision and strategic thinking
