Ifm-us · Engineering

Senior Distributed Systems Engineer

Sunnyvale · onsite · senior

Posted 2mo ago · via Lever

About this role

About the Institute of Foundation Models

The Institute of Foundation Models (IFM) designs and operates ultra-scale GPU supercomputing systems to train next-generation foundation models. We believe performance, fault tolerance, and scalability are co-designed across model architecture, communication systems, runtime, and hardware topology. This role sits at the core of that effort, driving communication performance, distributed reliability, and cross-layer optimization for large-scale training workloads.

The Mission

We are looking for a deeply technical engineer to co-design and optimize the communication stack for large-scale distributed training, including hybrid parallelism and Mixture-of-Experts (MoE) workloads. This is not a network operations role.…

Read the full description on Ifm-us's site →

What we'd score you on

reqspace match rubric

Five dimensions, recruiter-grade. Upload your resume and we'll generate a written explanation of where you fit and where the gaps are.

1. Skills match

For this role: Go, Rust, C++, PyTorch, GitHub

2. Level fit

This role is senior-level. We check your trajectory against it.

3. Domain experience

Your work in the role's domain matters more than your years total. We weight recent and direct experience.

4. Recency

A skill you used last quarter weighs more than one from five years ago. We grade on recency, not lifetime.

5. Location fit

This role is based in Sunnyvale. We weight your proximity and willingness to relocate.


Skills in this role

Pulled from the job description. These are the keywords we'll weight when scoring your fit.

Go · Rust · C++ · PyTorch · GitHub
