Meta/Facebook, AI and Systems
Position ID:
Position Title: Research Scientist
Position Type: Government or industry
Position Location: Menlo Park, California 94025, United States of America
Subject Area:
Appl Deadline: (posted 2024/11/02, listed until 2025/05/02)
Position Description:
The team, led by Chunqiang Tang (a.k.a. CQ Tang), consists of over 100 employees, mostly PhDs, including many world-class research scientists and engineers. As reflected in our team name "co-design", we conduct interdisciplinary research and development across AI, hardware, and software, with a focus on performance, efficiency, and scalability.
- We own the company's overall strategy for exploring innovative hardware technologies for CPUs, GPUs, memory, storage, and Meta's custom AI chips. We directly productionize them in Meta's hyperscale fleet of O(1,000,000) servers and O(100,000) GPUs, powering all Meta products such as Facebook, Instagram, and meta.ai.
- We apply novel software optimizations across the whole stack, from ML models and applications down to the Linux kernel, to achieve optimal performance on the hardware.
- We develop innovative AI technologies for large language models (Llama), ranking systems, and more.
Here are selected publications that showcase our work in diverse areas.
AI chip and server design
Systems for AI
- The Llama 3 Herd of Models.
  - Our contributions include much of the work described in the paper's Section 3.3 "Infrastructure, Scaling, and Efficiency", Section 6 "Inference", and Section 7.3 "Model Scaling".
- Llama 2: Open Foundation and Fine-Tuned Chat Models.
  - Our contributions include re-architecting Llama's training infrastructure and transitioning it from a research environment to Meta's hyperscale production infrastructure, enabling future Llama training to scale to tens of thousands of GPUs and beyond.
- Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
- PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel
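For context on the fully sharded data parallelism referenced in the PyTorch FSDP entry above, here is a minimal sketch of the standard FSDP wrapping pattern. It is illustrative only: the toy model, dimensions, and training step are assumptions for this sketch, not Meta's Llama training code.

```python
# Minimal FSDP sketch: shard parameters, gradients, and optimizer state across
# ranks, gathering full parameters only around each forward/backward pass.
# Launch with torchrun so RANK/WORLD_SIZE are set; the toy model is hypothetical.
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024)).cuda()
    model = FSDP(model)                       # wrap the model so its state is sharded
    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)

    x = torch.randn(8, 1024, device="cuda")   # stand-in for a real training batch
    loss = model(x).pow(2).mean()
    loss.backward()                           # gradients are reduce-scattered across ranks
    optim.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```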
ML models and kernels
- Deep Learning Recommendation Model for Personalization and Recommendation Systems
- Wukong: Towards a Scaling Law for Large-Scale Recommendation
- FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference
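To make the DLRM entry in the list above concrete, the following is a heavily simplified sketch of a DLRM-style model: embedding tables for sparse categorical features, a bottom MLP for dense features, pairwise dot-product feature interaction, and a top MLP. The class name, dimensions, and layer sizes are illustrative assumptions, not the open-source DLRM code.

```python
# Simplified DLRM-style architecture sketch (illustrative dimensions only).
import torch
import torch.nn as nn

class TinyDLRM(nn.Module):
    def __init__(self, num_embeddings=1000, dim=16, num_sparse=3, num_dense=4):
        super().__init__()
        # One embedding table per sparse (categorical) feature.
        self.tables = nn.ModuleList(
            [nn.Embedding(num_embeddings, dim) for _ in range(num_sparse)]
        )
        self.bottom_mlp = nn.Sequential(nn.Linear(num_dense, dim), nn.ReLU())
        n = num_sparse + 1                      # sparse embeddings + dense vector
        self.top_mlp = nn.Sequential(nn.Linear(n * (n - 1) // 2 + dim, 1))

    def forward(self, dense, sparse):
        d = self.bottom_mlp(dense)                       # (B, dim)
        feats = [t(sparse[:, i]) for i, t in enumerate(self.tables)] + [d]
        f = torch.stack(feats, dim=1)                    # (B, n, dim)
        inter = torch.bmm(f, f.transpose(1, 2))          # pairwise dot products
        i, j = torch.triu_indices(f.size(1), f.size(1), offset=1)
        return self.top_mlp(torch.cat([inter[:, i, j], d], dim=1))

model = TinyDLRM()
out = model(torch.randn(2, 4), torch.randint(0, 1000, (2, 3)))  # (B=2, 1) logits
```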
ML numerics, pruning, distillation, and optimizers
- Microscaling Data Formats for Deep Learning
- INT4 Decoding GQA CUDA Optimizations for LLM Inference
- Pruning and Distillation to Enable Llama 3.2 1B and 3B Models Suitable for Mobile Devices
- PyTorch Distributed Shampoo
- Winning the MLCommons Training Algorithms Competition
HPC and collective communications library (MPI, NCCL, RCCL)
- Accelerating Communication in Deep Learning Recommendation Model Training with Dual-Level Adaptive Lossy Compression
- Training Deep Learning Recommendation Model with Quantized Collective Communications
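As a rough illustration of the idea behind quantized collective communications named in the entry above (not Meta's implementation), the sketch below casts a gradient tensor to fp16 before the all-reduce to roughly halve communication volume, then dequantizes and averages. The function name and the assumption of an already-initialized NCCL process group are hypothetical.

```python
# Illustrative sketch: lossy-compress a gradient to fp16 before all-reduce to
# cut communication volume. Assumes dist.init_process_group("nccl") has run.
import torch
import torch.distributed as dist

def quantized_allreduce_mean(grad: torch.Tensor) -> torch.Tensor:
    buf = grad.to(torch.float16)               # quantize before communication
    dist.all_reduce(buf, op=dist.ReduceOp.SUM) # sum the compressed gradients
    return buf.to(grad.dtype) / dist.get_world_size()
```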
Performance benchmarking and projection
- DCPerf: An open source benchmark suite for hyperscale compute applications
- DLRM: An advanced, open source deep learning recommendation model
Hardware and software co-design
Like research labs, our team consists primarily of PhDs, and we strongly encourage and excel at publishing research. However, we differ from traditional research labs in several key ways:
- Production systems: Our primary goal is to develop forward-looking innovations in AI, hardware, and software, and to implement them directly in production systems that serve billions of people. The billions of users of Meta products and Meta's hyperscale fleet of O(1,000,000) servers and O(100,000) GPUs are, in effect, our lab. In contrast, traditional research labs often rely on technology transfer, which yields a less direct impact.
- Direct ownership: Like traditional research labs, we build strong partnerships with numerous teams across diverse areas for broad influence. However, what sets us apart is our direct ownership of the hardware strategy for Meta's hyperscale fleet. This enables us to lead in many areas while fostering seamless partnerships in others.
- Impact: Our impact is widely recognized across the company. We drive Meta's hardware strategy to save billions of dollars, and directly develop innovative technologies in Meta's flagship products like Llama and Ads ranking models.
See our list of publications for more details.
We are not accepting applications for this job through AcademicJobsOnline.Org right now. Please apply at https://2024resumedropco-design.splashthat.com/.
- Contact: Chunqiang Tang
- Email:
- Postal Mail: 1 Hacker Way, Menlo Park, CA 94025
- Web Page: https://aisystemcodesign.github.io/