Apollo GraphQL

Senior Software Engineer - Trust and Telemetry

Location

United States or Canada (remote)

Workplace

Remote

Type

Full Time

Salary

USD 165,000 – 195,000

Level

Senior

Role

Backend Engineer

Posted

Jun 30, 2026

The role

Summary

Senior Software Engineer on the Trust and Telemetry team at Apollo GraphQL, responsible for owning subdomains including OAuth/SSO flows, audit logging, telemetry ingestion, and fine-grained permissions. This role requires 2+ years of backend service experience, distributed systems expertise, and production incident management skills to build enterprise-grade identity, access management, and observability infrastructure for GraphOS.

What you'll do

Own a Trust and Telemetry (TnT) Subdomain: Take full technical responsibility for a critical component within the Trust and Telemetry platform, whether OAuth/SSO flows, audit logging infrastructure, telemetry ingestion pipelines, or fine-grained permissions models. This ownership includes architecture decisions, code quality, and system reliability.
Drive Features End-to-End: Lead features through the complete development lifecycle: write technical design documents, align with stakeholders, implement solutions, write comprehensive tests, coordinate deployments, and monitor production performance using observability tools and metrics.
Build for Enterprise Reliability: Design systems with production-grade observability, graceful failure modes, and incremental delivery strategies. Participate in on-call rotations, incident response, and postmortem analysis to maintain high availability and service resilience for enterprise customers.
Collaborate Across Functions: Work directly with Product and Design teams to scope requirements, define technical trade-offs, and prioritize work. Communicate asynchronously across distributed teams using design documents, design discussions, and architecture reviews.
Contribute to Technical Growth: Participate actively in code reviews, architectural design discussions, and technical standard development. Mentor team members, receive constructive feedback, and help establish best practices across the Trust and Telemetry organization.
Telemetry Platform Development: Build and maintain pipelines, APIs, and user interfaces that transform raw telemetry data into actionable insights. Enable teams to safely evolve and scale their GraphQL graphs with usage, performance, and schema visibility.
Trust and Identity Infrastructure: Develop identity and access management systems including OAuth and SSO implementation, fine-grained role-based tokens, and comprehensive audit logging. Ensure secure, compliant access control for enterprise GraphOS customers.

What we look for

Technical

Distributed Systems Design: Deep understanding of consistency models, failure modes, network partitions, and resiliency patterns. Ability to architect systems that handle Byzantine failures, eventual consistency, and graceful degradation.
Production Backend Services: Demonstrated experience designing, building, and operating backend services in production environments at scale. Knowledge of deployment strategies, rolling updates, and feature flags for zero-downtime deployments.
API and Data Model Architecture: Experience designing APIs and data models that other teams and customers depend on. Ability to maintain backward compatibility, design for extensibility, and document public interfaces comprehensively.
Modern Backend Languages: Production proficiency in at least one modern backend language such as TypeScript/Node.js, Go, Rust, Java, or Kotlin. Ability to write clean, maintainable code with strong typing and error handling.
Database Technologies: Hands-on experience with relational databases (PostgreSQL, MySQL) and at least one of key-value stores (Redis), document databases (MongoDB), or time-series databases (InfluxDB, TimescaleDB). Understanding of indexing, query optimization, and data access patterns.
Production Operations and Incident Management: Hands-on ownership of services in production including on-call rotations, incident response, debugging, and postmortem analysis. Comfort with pagers, escalation procedures, and rapid incident mitigation.
Observability and Monitoring: Practical experience with metrics collection, distributed tracing, structured logging, and SLO/SLI definition. Familiarity with observability platforms like Datadog, New Relic, Prometheus, Grafana, or Jaeger for monitoring system health.

Education

Bachelor's Degree in Computer Science or Related Field: Formal education in Computer Science, Software Engineering, or related discipline providing foundational knowledge in algorithms, data structures, and systems design. Advanced degrees are optional but valued.
Continuing Education in Modern Distributed Systems: Demonstrated commitment to staying current with distributed systems concepts, cloud-native architecture patterns, and backend engineering best practices through courses, conferences, or technical reading.

Experience

Backend Service Development: At least 2 years of hands-on experience designing and building backend services in production environments. Experience scaling systems to support enterprise workloads and managing service dependencies.
Design Document Collaboration: Experience writing clear, detailed technical design documents and participating in architectural design reviews. Comfortable with async-first communication and cross-functional alignment on technical decisions.
API-First Development: Background building APIs that serve multiple internal or external consumers. Understanding of API versioning, backwards compatibility, deprecation strategies, and developer experience.
Database Administration: Operational experience managing databases in production, including performance tuning, scaling strategies, backup/restore procedures, and schema migrations.

Skills

Required skills

Backend Service Architecture: Design scalable, resilient backend systems with clean abstractions, proper separation of concerns, and production-ready patterns. Architect for high availability, fault tolerance, and operational visibility.
Distributed Systems Fundamentals: Master consistency models (CAP theorem, ACID vs BASE), failure detection and recovery, consensus algorithms, and replication strategies. Apply these concepts to real-world systems.
API Design and Development: Design and implement RESTful or GraphQL APIs with strong contracts, comprehensive documentation, and developer-friendly interfaces. Handle versioning, authentication, rate limiting, and error handling elegantly.
Database Design and Query Optimization: Model data efficiently for relational and NoSQL databases. Write optimized queries, design appropriate indexes, understand query execution plans, and troubleshoot performance issues.
Incident Response and Postmortem Analysis: Respond effectively to production incidents, diagnose root causes using logs and metrics, implement rapid mitigations, and conduct blameless postmortems to prevent recurrence.
Observability Implementation: Instrument applications with structured logging, metrics, and distributed tracing. Define meaningful SLOs/SLIs, set up alerting, and use observability data to guide debugging and optimization.
Technical Communication: Write clear design documents, participate effectively in code reviews, articulate complex technical concepts to non-technical stakeholders, and collaborate asynchronously across time zones.

Nice to have

GraphQL Expertise: Practical experience designing and implementing GraphQL servers, understanding query planning, schema stitching, and federation. Knowledge of GraphQL best practices for security and performance.
OAuth 2.0 and Enterprise SSO: Experience implementing OAuth 2.0 and OpenID Connect flows, SAML integration, or enterprise SSO systems. Understanding of token management, authorization code flows, and single sign-on architectures.
Identity and Access Management: Background in IAM systems, role-based access control (RBAC), attribute-based access control (ABAC), fine-grained permissions models, and audit logging for compliance frameworks.
Security and Compliance Engineering: Experience building security-critical systems with threat modeling, secure coding practices, compliance requirements (SOC 2, GDPR, HIPAA), and security auditing capabilities.
Cloud-Native Architecture: Hands-on experience with containerization (Docker), orchestration (Kubernetes), service meshes (Istio, Linkerd), and cloud platforms (AWS, GCP, Azure). Understanding of cloud-native patterns and tradeoffs.
Telemetry and Analytics Infrastructure: Experience building data pipelines, time-series databases, analytics systems, or metrics aggregation platforms. Knowledge of data modeling for observability and analytics use cases.
AI Agent Integration: Familiarity with AI agent architectures, tool calling/function calling patterns, agentic middleware, and LLM integration patterns for intelligent system automation.

Compensation & benefits

Salary

USD 165,000 – 195,000 (annual)

Stock options

Available

Benefits

Remote Work Flexibility

Full remote work opportunity with a distributed team across multiple time zones, providing flexibility in work location and schedule while maintaining async-first collaboration practices.

Equity and Stock Options

Competitive equity package providing ownership stake in Apollo GraphQL, a high-growth company in the developer tools space with strong market position and growth trajectory.

Professional Development

Learning and development budget for conferences, courses, and technical certifications. Access to internal mentorship, code review feedback from senior engineers, and growth opportunities into leadership roles.

Health and Wellness Benefits

Comprehensive health coverage including medical, dental, and vision insurance. Mental health support, fitness benefits, and flexible time off policies supporting work-life balance.

Impact on Open Source

Work on widely-used open-source GraphQL tools and contribute to the broader GraphQL ecosystem. Build features and infrastructure used by thousands of developers globally.

Enterprise Customer Base

Work directly with enterprise customers on mission-critical infrastructure. Your work impacts Fortune 500 companies and high-growth startups using Apollo GraphQL's platform.


Apollo GraphQL

Apollo GraphQL develops open-source GraphQL tools and a leading GraphQL implementation platform, empowering teams to build, query, and manage APIs efficiently.

Remote, USA
Founded 2015
apollographql.com

Tech Stack

Languages
TypeScript/Node.js, Go, Rust, Java
Frameworks
Express.js or Fastify, GraphQL Apollo Server, Spring Boot, gRPC
Databases
PostgreSQL, Redis, MongoDB, TimescaleDB or InfluxDB
Tools
Kubernetes, Docker, Git and GitHub, CI/CD Pipelines, Datadog or Prometheus
Other
OAuth 2.0 and OpenID Connect, Distributed Tracing, Service Meshes, Incident Management