Airwallex

Senior ML Platform Engineer, AI Platform

Location

SG - Singapore

Type

Full Time

Level

Senior

Role

ML Engineer

Posted

Nov 19, 2025


The role

Summary

Airwallex is seeking a Senior ML Platform Engineer to build next-generation machine learning infrastructure for its new AI team. The role involves designing and maintaining MLOps platforms using Kubernetes and cloud services, implementing CI/CD/CT pipelines, and building high-performance model serving infrastructure. The ideal candidate has 5+ years of backend development experience with 2+ years focused on AI/ML platforms, expertise in Python and distributed systems, and experience with MLOps practices, including automated deployment pipelines and production lifecycle management.

What you'll do

Platform Development: Design, build, and maintain an end-to-end MLOps platform using Kubernetes and cloud services
Infrastructure as Code: Use Terraform to manage, provision, and scale ML-related infrastructure securely and efficiently
Pipeline Automation: Implement and optimize CI/CD/CT pipelines for model training, testing, packaging, and deployment using Argo and Kubeflow Pipelines
Model Serving Infrastructure: Build highly available, low-latency, and high-throughput model serving infrastructure
Observability Implementation: Implement robust monitoring, alerting, and logging solutions to track infrastructure health, model performance, and data/model drift
ML Tooling Support: Evaluate, integrate, and support ML tools such as Feature Stores and distributed model training pipelines
Security & Compliance: Ensure platform security, implement RBAC, and manage secrets for sensitive data and production environments
Cross-functional Collaboration: Work closely with Data Scientists and ML Engineers to understand needs and provide technical guidance on scaling best practices
LLM Platform Development: Contribute to the evolution of a unified AI Platform covering both traditional ML and growing LLM capabilities
Performance Optimization: Optimize model serving solutions for low-latency, high-throughput production environments
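The observability bullet above mentions tracking data/model drift. As a purely illustrative sketch (not Airwallex's actual tooling), one common drift metric is the Population Stability Index over binned feature distributions, shown here in plain Python with hypothetical bin counts:

```python
import math

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two binned distributions.

    A common rule of thumb: PSI below 0.1 suggests little shift, while
    above 0.25 signals significant drift worth alerting on.
    """
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    score = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Convert raw bin counts to proportions; eps guards against empty bins.
        e_pct = max(e / e_total, eps)
        a_pct = max(a / a_total, eps)
        score += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return score

# Identical distributions produce a PSI of zero (no drift).
assert psi([50, 30, 20], [50, 30, 20]) == 0.0
# A clearly shifted distribution exceeds the 0.25 alert threshold.
assert psi([50, 30, 20], [20, 30, 50]) > 0.25
```

In a production platform, a check like this would typically run on a schedule against serving logs and feed the monitoring/alerting stack rather than be evaluated inline.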

What we look for

Technical

Backend Development: 5+ years of experience in backend software development
MLOps Expertise: 2+ years focused on AI/ML platform or MLOps infrastructure
Model Serving: Proven experience designing and implementing low-latency model serving solutions
Python Proficiency: Strong programming skills in Python for ML platform development
Distributed Systems: Experience designing and developing large-scale distributed, high-concurrency, low-latency systems
Code Quality: Ability to write high-quality, maintainable code
Production MLOps: Deep expertise in MLOps practices, including automated deployment pipelines and production lifecycle management

Education

Bachelor's Degree: Relevant degree in Computer Science, Mathematics, or related technical fields

Experience

Communication Skills: Excellent communication and mentoring abilities for cross-functional collaboration
Infrastructure Management: Experience with cloud infrastructure and Kubernetes for ML workloads
LLM Optimization (preferred): Working knowledge of LLM serving optimization and GPU resource management
Distributed Computing (preferred): Familiarity with distributed compute/training frameworks such as Ray and Spark

Skills

Required skills

Python: Primary programming language for MLOps development
Kubernetes: Container orchestration for ML platform infrastructure
Terraform: Infrastructure as Code for ML infrastructure management
MLOps Practices: Automated deployment pipelines and production lifecycle management
Distributed Systems: Large-scale, high-performance system design and implementation
CI/CD/CT Pipelines: Continuous integration, delivery, and training automation
Model Serving: Low-latency, high-throughput model deployment solutions

Nice to have

Ray: Distributed computing framework for ML workloads
Apache Spark: Big data processing and distributed computing
Kubeflow: ML workflow orchestration on Kubernetes
vLLM: LLM serving optimization
Triton Inference Server: AI model serving platform
GPU Management: Resource optimization for ML training and inference
Feature Stores: ML feature management and serving
Cloud Platforms: AWS, GCP, or Azure for ML infrastructure
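Several of the nice-to-have items above (Kubernetes, vLLM, GPU management) typically come together in model serving deployments. As a purely illustrative sketch, a vLLM server might run on Kubernetes under a Deployment like the one below, built here as a Python dict; the model name and replica count are hypothetical placeholders, not anything specified in this posting:

```python
import json

# Hypothetical Kubernetes Deployment for an LLM serving workload.
# Image, model, and replica count are illustrative placeholders only.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "llm-server"},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "llm-server"}},
        "template": {
            "metadata": {"labels": {"app": "llm-server"}},
            "spec": {
                "containers": [{
                    "name": "vllm",
                    "image": "vllm/vllm-openai:latest",
                    "args": ["--model", "some-org/some-model"],
                    "ports": [{"containerPort": 8000}],
                    # GPU scheduling: request one NVIDIA GPU per replica.
                    "resources": {"limits": {"nvidia.com/gpu": 1}},
                }]
            },
        },
    },
}

# Serialize; a real platform would render this as YAML via CI/CD tooling.
manifest = json.dumps(deployment, indent=2)
```

In practice, manifests like this are usually generated and applied through Terraform, Helm, or GitOps pipelines rather than hand-written, which is where the Infrastructure-as-Code skills above come in.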

Compensation & benefits

Benefits

Global Team

Work with over 2,000 innovative people across 26 offices worldwide

Career Growth

Accelerated learning and true ownership in a high-growth fintech environment

Cutting-edge Technology

Work on next-generation AI and ML platforms with modern tech stack

Impact-driven Work

Build solutions that serve over 200,000 businesses globally including major brands

Innovation Culture

Join a brand-new AI team driving innovation in financial technology

Equal Opportunity

Inclusive workplace that values diversity and provides equal opportunities


Interview process

  1. Application Review: Initial screening of resume and technical background, focusing on MLOps experience
  2. Technical Phone Screen: Discussion of MLOps concepts, system design, and Python programming skills
  3. System Design Interview: Design an ML platform architecture, discussing scalability, performance, and reliability
  4. Coding Interview: Python coding assessment focused on infrastructure automation and ML pipeline development
  5. Technical Deep Dive: In-depth discussion of previous MLOps projects, Kubernetes experience, and platform engineering
  6. Team Fit Interview: Cultural fit assessment and discussion with potential team members and stakeholders
  7. Final Interview: Leadership interview focusing on collaboration skills, mentoring abilities, and long-term vision

Apply for this position



Airwallex

Airwallex is a Singapore-based financial technology company specializing in cross-border payments and financial services for businesses.

Singapore · Founded 2015 · airwallex.com

Tech Stack

Languages: Python, SQL
Frameworks: Kubeflow Pipelines, Argo Workflows, Ray, Apache Spark
Databases: Feature Store, Model Registry
Tools: Kubernetes, Terraform, Docker, vLLM, Triton Inference Server, TGI, Prometheus, Grafana
Other: AWS/GCP/Azure, GPU Management, CI/CD/CT Pipelines, RBAC