Poshmark

Senior Site Reliability Engineer (Production Excellence)

Location

Redwood City, California, USA

Type

Full Time

Salary

USD 155,900 – 261,100

Level

Senior

Role

Site Reliability Engineer

Posted

Dec 22, 2025

The role

Summary

Poshmark is seeking a Senior Site Reliability Engineer to architect and optimize its web-scale infrastructure, focusing on automation, system reliability, and proactive performance management. The ideal candidate will bring expertise in cloud operations, infrastructure as code, and production support to strengthen Poshmark's mission-critical technology ecosystem.

What you'll do

Production System Accountability: Serve as the primary point of accountability for health, performance, and capacity of mission-critical, internet-facing services
Infrastructure Design Partnership: Collaborate with development teams from design phase to ensure platforms are built with operability and recoverability as core principles
Automation and Monitoring: Improve and develop tools for automated deployment and monitoring of custom applications in large-scale UNIX environments
On-Call Support: Participate in a structured 12x7 on-call rotation to maintain 24/7 production environment support
Incident Management: Lead incident response and conduct blameless post-mortems to continuously improve system reliability

What we look for

Technical

Cloud Platforms: Expert-level experience with AWS, GCP, or Azure cloud infrastructure
Containerization: Advanced knowledge of Kubernetes and Docker
CI/CD Tools: Proficiency in Jenkins, Ansible, and Terraform

Education

Degree: Bachelor's degree in Computer Science, Software Engineering, or a related technical field preferred

Experience

Production Support: 5-8+ years of experience in Systems Engineering or Site Reliability roles, preferably in startup or fast-growing environments
Large-Scale Operations: Proven track record in UNIX-based, large-scale web operations

Skills

Required skills

Infrastructure as Code: Strong scripting and coding skills for infrastructure automation
Observability Tools: Experience with Datadog, New Relic, Graphite, or Nagios

Nice to have

Incident Response: Experience leading complex incident management and post-mortem analyses
Performance Optimization: Proven ability to design high-fidelity monitoring and alerting systems

Compensation & benefits

Salary

USD 155,900 – 261,100 (annual)

Stock options

Available

Benefits

Hybrid Work

Flexible work arrangement based in Redwood City, CA

Tech Community

Opportunity to work with a dynamic social commerce platform serving 165 million members


Interview process

  1. Initial Screening: Phone or video call with the recruiting team to assess background and experience
  2. Technical Assessment: Comprehensive technical interview focusing on SRE skills, system design, and problem-solving
  3. Onsite/Virtual Interviews: Multiple rounds of interviews with engineering and leadership teams, including system design and cultural fit assessments
  4. Final Review: Candidate evaluation and decision-making by the hiring committee
