Cohere

Senior Technical Program Manager, Machine Learning Infrastructure

Location

Canada

Workplace

Remote

Type

Full Time

Salary

USD 160,000 – 220,000

Level

Senior

Role

Technical Program Manager

Posted

Jun 23, 2026

The role

Summary

Senior Technical Program Manager overseeing Machine Learning infrastructure programs at Cohere, a leading enterprise AI company. This strategic role manages inference, serving, efficiency, and endpoint systems while coordinating cross-functional efforts between ML modeling teams and customer-facing operations. Requires 5+ years of technical program management experience in ML infrastructure with strong execution capabilities in fast-paced, complex environments.

What you'll do

ML Infrastructure Program Portfolio Management: Own the complete program portfolio spanning model inference, serving efficiency, and endpoint infrastructure. Ensure systems scale reliably to support rapidly expanding internal and external user bases while maintaining performance standards and architectural integrity across distributed systems.
Cross-Functional Coordination and Execution: Lead end-to-end program coordination with particular emphasis on orchestrating collaboration between ML modeling teams, infrastructure engineers, and customer-facing product teams. Drive alignment on technical dependencies, timeline requirements, and resource allocation across organizational boundaries.
Process Optimization and Best Practices: Identify infrastructure pain points and establish operational processes that enable engineering teams to maintain high development velocity while consistently meeting stakeholder needs. Implement and refine engineering best practices to balance rapid iteration with system reliability.
Incident Management and Continuous Improvement: Partner with incident management leads to foster a culture of continuous improvement. Ensure thorough root cause analysis, timely issue resolution, and effective implementation of preventative measures. Track and communicate learnings across the organization.
Strategic Prioritization and Program Management: Ruthlessly prioritize competing demands and overlapping projects to align with company strategic objectives. Manage scope creep, resource constraints, and competing stakeholder needs while maintaining focus on highest-impact deliverables.
Stakeholder Communication and Program Tracking: Establish and maintain clear tracking of timelines, deliverables, budgets, and scope across all programs. Communicate consistent, timely updates to engineering teams, leadership, and non-technical stakeholders on program status, risks, and incidents.
Strategic and Tactical Partnership: Serve as a trusted partner to senior technical leaders, providing both granular tactical support on day-to-day execution and strategic counsel on high-level problems with company-wide impact. Bridge the gap between technical depth and strategic vision.

What we look for

Technical

ML Infrastructure Systems Knowledge: In-depth understanding of machine learning infrastructure architecture including model inference systems, serving frameworks, distributed systems, and performance optimization techniques.
Hands-On Technical Experience: Direct technical experience in software engineering, infrastructure engineering, or related roles. Practical familiarity with building scalable systems, debugging complex technical issues, and understanding engineering trade-offs.
Machine Learning Fundamentals: Working knowledge of how machine learning models are trained, deployed, and served in production. Understanding of model performance characteristics, inference latency requirements, and serving infrastructure patterns.
Program Management Tools and Methodologies: Proficiency with project tracking tools, methodologies for managing technical programs (Agile, Scrum, Waterfall variants), and approaches for stakeholder communication and status reporting.

Education

Bachelor's Degree in Computer Science or Related Field: Formal education in computer science, software engineering, mathematics, or related technical discipline providing foundational understanding of computational systems and technical principles.

Experience

Technical Program Management: 5+ years of proven technical program management experience with demonstrated success managing complex, cross-functional programs in fast-paced environments.
ML Infrastructure Specialization: Focused experience in machine learning infrastructure domains including model inference optimization, model serving architecture, computational efficiency, and endpoint design and implementation.
Fast-Paced Environment Execution: Demonstrated ability to operate effectively in chaotic, low-structure, rapidly-scaling environments. Proven pragmatism, resourcefulness, and capability to perform tactical hands-on work while maintaining strategic perspective.
Technical Depth and Subject Matter Expertise: Strong working knowledge of ML infrastructure systems design and implementation. Ability to develop deep technical understanding of internal systems and establish credibility as a subject matter expert in the domain.

Skills

Required skills

Technical Program Management: Expert-level ability to manage complex multi-team technical programs with clear ownership of scope, timeline, budget, and resource allocation across distributed teams.
Cross-Functional Leadership: Proven ability to lead and influence without direct authority. Drive alignment and collaboration between engineering teams, research teams, and non-technical stakeholders with competing priorities.
ML Infrastructure Knowledge: Deep technical knowledge of machine learning infrastructure domains including inference, serving, efficiency optimization, and scalable endpoint architecture.
Execution and Problem Solving: Exceptional execution capability with a pragmatic approach to problem-solving. Comfort with rolling up sleeves to unblock teams and drive tactical outcomes while maintaining strategic perspective.
Communication and Stakeholder Management: Clear, concise communication adapted to audience technical level. Ability to explain complex infrastructure concepts to non-technical stakeholders and translate business needs into technical requirements.
Strategic Thinking: Capability to think strategically about program architecture, organizational processes, and long-term technical direction while executing tactically on immediate priorities.
Incident Response and Root Cause Analysis: Experience leading incident response processes, facilitating root cause analysis, and driving implementation of preventative measures to reduce recurring issues.

Nice to have

Hands-On Technical Background: Direct experience working as a software engineer, infrastructure engineer, or ML engineer providing practical understanding of technical trade-offs and engineering challenges in building scalable systems.
Machine Learning Team Collaboration: Demonstrated experience working closely with machine learning teams. Deep understanding of research engineering dynamics, model development workflows, and the intersection between research velocity and infrastructure stability.
Distributed Systems Understanding: Working knowledge of distributed systems architecture, scalability challenges, and reliability patterns relevant to managing high-throughput, low-latency inference and serving infrastructure.
Enterprise AI or LLM Experience: Experience working with large language models, foundation models, or enterprise AI systems. Understanding of unique challenges in deploying and scaling cutting-edge AI models for production use.
Cost Optimization and Resource Planning: Track record of optimizing infrastructure costs while maintaining performance requirements. Experience with capacity planning and resource allocation decisions balancing business needs with technical constraints.

Compensation & benefits

Salary

USD 160,000 – 220,000 (annual)

Stock options

Available

Benefits

Weekly Lunch Stipend

Weekly lunch allowance of $75 USD (or equivalent in local currency) to support meal expenses and team building opportunities.

Comprehensive Health and Dental Coverage

Full health and dental benefits with dedicated mental health budget supporting employee wellness across physical and mental health domains.

Retirement Planning

RRSP matching (Canada), 401(k) contributions (US), or Pension Scheme (UK) depending on employment location, supporting long-term financial security.

Parental Leave

100% parental leave top-up for up to 6 months for either parent, supporting family planning and work-life balance.

Annual Enrichment Benefits

Dedicated budgets for arts and culture experiences, fitness and wellness programs, quality time and life experiences, plus annual workspace improvement credit enhancing work environment.

Education and Learning Stipend

Annual budget supporting professional development through conferences, courses, coaching, and continuous learning opportunities aligned with career growth.

Paid Time Off

6 weeks of paid vacation (30 working days annually) providing substantial time for rest, travel, and personal pursuits.

Office Travel and Offsite Opportunities

Budget for remote employees to travel to Cohere offices, plus attendance at annual company offsite events for team connection and strategic alignment.

Remote Work Setup

One-time $500 home office stipend to set up a professional workspace for productive remote work.

Co-working Benefits

Access to co-working spaces in your local city for remote employees not near Cohere offices, enabling flexible professional workspace options.

Office Amenities

For in-office team members: daily lunch program, comprehensive snacks, and regular community and social events fostering team connection.


Interview process

  1. Application Review: Your application will be reviewed by Cohere recruiters against the job criteria. The company leverages AI-enabled tools to help identify potentially qualified candidates while maintaining human review of all applications.
  2. Recruiter Screening: Initial conversation with a Cohere recruiter to discuss your background, experience with ML infrastructure programs, and fit for the Technical Program Manager role. Expect to discuss your approach to managing complex cross-functional initiatives.
  3. Hiring Manager Discussion: Conversation with the ML Infrastructure team's leadership to explore your technical depth, program management philosophy, and approach to working in fast-paced, ambiguous environments. Be prepared to discuss specific examples of infrastructure programs you've managed.
  4. Technical Program Management Assessment: Detailed discussion or case study focused on your program management capabilities. You may be presented with infrastructure challenges similar to those at Cohere and asked how you would approach prioritization, stakeholder communication, and execution.
  5. Cross-Functional Stakeholder Conversations: Potential discussions with engineering team members, ML researchers, or product leaders you would collaborate with. These conversations assess your ability to communicate across technical and non-technical audiences and build collaborative relationships.
  6. Executive Alignment: Final conversation with senior leadership to ensure organizational fit, discuss your vision for the ML infrastructure program, and clarify questions about Cohere's technical direction and growth trajectory.
