Machine Learning Compiler Engineer
About the job
For a high-growth, Series A deep-tech company with over $50 million in funding, we’re seeking a Machine Learning Compiler Engineer to join their growing team.
Responsibilities:
Lower deep learning graphs – from common frameworks (PyTorch, TensorFlow, Keras, etc.) down to an intermediate representation (IR) for training – with a particular focus on ensuring reproducibility.
Write novel algorithms – for transforming intermediate representations of compute graphs between different operator representations.
Ownership – of two of the following compiler areas:
- Front-end: Integrate common Deep Learning Frameworks with our internal IR, and implement transformation passes in ONNX to adapt IR for middle-end consumption.
- Middle-end: Design compiler passes for training-based compute graphs, integrate reproducible Deep Learning kernels into the code generation stage, and debug compilation passes and transformations.
- Back-end: Translate IR from the middle-end to GPU target machine code.
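To give a flavor of the middle-end work described above, here is a minimal, purely illustrative sketch in Rust (the company’s default language) of a classic IR transformation pass – constant folding over a toy compute graph. The `Node` type and `fold_constants` function are invented for this example and do not reflect the company’s actual internal IR.

```rust
// Illustrative only: a toy compute-graph IR and a constant-folding pass.
// Real training IRs carry shapes, dtypes, and many more operators.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Const(f64),
    Add(Box<Node>, Box<Node>),
    Mul(Box<Node>, Box<Node>),
}

/// Recursively replace operator nodes whose inputs are all constants
/// with a single `Const` node.
fn fold_constants(node: &Node) -> Node {
    match node {
        Node::Const(v) => Node::Const(*v),
        Node::Add(a, b) => match (fold_constants(a), fold_constants(b)) {
            (Node::Const(x), Node::Const(y)) => Node::Const(x + y),
            (l, r) => Node::Add(Box::new(l), Box::new(r)),
        },
        Node::Mul(a, b) => match (fold_constants(a), fold_constants(b)) {
            (Node::Const(x), Node::Const(y)) => Node::Const(x * y),
            (l, r) => Node::Mul(Box::new(l), Box::new(r)),
        },
    }
}

fn main() {
    // (2 + 3) * 4 folds to a single constant node.
    let graph = Node::Mul(
        Box::new(Node::Add(
            Box::new(Node::Const(2.0)),
            Box::new(Node::Const(3.0)),
        )),
        Box::new(Node::Const(4.0)),
    );
    assert_eq!(fold_constants(&graph), Node::Const(20.0));
    println!("folded: {:?}", fold_constants(&graph));
}
```

Production passes of this kind typically run over a graph data structure with shared subexpressions rather than a tree, but the traversal-and-rewrite pattern is the same.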
Required Skills:
- Fundamental knowledge of traditional compilers (e.g., LLVM, GCC) and of the graph traversals used in compiler development.
- Strong software engineering skills, demonstrated by contributing to and deploying production-grade code.
- Understanding of parallel programming, particularly concerning GPUs.
- Willingness to learn Rust, as it is our company’s default programming language.
- Ability to work with high-level IR (Clang/LLVM) through middle-end optimization, and/or with low-level IR, LLVM targets, and target-specific optimizations, especially GPU-specific optimizations.
- Highly self-motivated with excellent verbal and written communication skills.
- Comfortable working independently in an applied research environment.
Bonus Skills:
- Thorough understanding of computer architectures used for training neural network graphs (e.g., Intel Xeon CPUs, GPUs, TPUs, custom accelerators).
- Experience in systems-level programming with Rust.
- Contributions to open-source Compiler Stacks.
- In-depth knowledge of compilation in relation to High-Performance Computer architectures (CPU, GPU, custom accelerator, or a heterogeneous system).
- Strong foundation in CPU and GPU architectures, numeric libraries, and modular software design.
- Understanding of recent architecture trends and fundamentals of Deep Learning, along with experience with machine learning frameworks and their internals (e.g., PyTorch, TensorFlow, scikit-learn, etc.).
- Exposure to Deep Learning compiler frameworks such as TVM, MLIR, Tensor Comprehensions, Triton, or JAX.
- Experience in writing and optimizing highly-performant GPU kernels.
We are excited to hear from you. Please apply for more details on this wonderful opportunity!