Table of Contents
- Splitting Input Data
- Parameter Server and All-Reduce
- Building a Data Parallel Training and Serving Pipeline
- Bottlenecks and Solutions
- Splitting the Model
- Pipeline Input and Layer Split
- Implementing Model Parallel Training and Serving Workflows
- Achieving Higher Throughput and Lower Latency
- A Hybrid of Data and Model Parallelism
- Federated Learning and Edge Devices
- Elastic Model Training and Serving
- Advanced Techniques for Further Speed-Ups

