Metal for Accelerating Machine Learning
WWDC 2018
Metal Performance Shaders
GPU-accelerated primitives, optimized for iOS and macOS
- Image processing
- Linear algebra
- Machine learning
  - Inference
  - Training (new)
- Ray tracing (new)
Training
Inference
CNN Inference Enhancements
FP16 accumulation
- Available with Apple A11 Bionic GPU for
  - Convolution
  - Convolution transpose
- Sufficient precision for commonly used neural networks
- Delivers better performance than FP32
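To get a feel for what FP16 accumulation means numerically, the sketch below emulates it on the CPU in plain Python. It assumes IEEE 754 half precision as provided by the standard `struct` module's `'e'` format. The helper names `to_fp16`, `dot_fp16` and `dot_reference` are hypothetical and are not part of Metal Performance Shaders.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def dot_fp16(a, b):
    """Dot product whose products and running sum are rounded to FP16."""
    acc = 0.0
    for x, y in zip(a, b):
        product = to_fp16(to_fp16(x) * to_fp16(y))
        acc = to_fp16(acc + product)  # accumulate in half precision
    return acc

def dot_reference(a, b):
    """Same dot product accumulated in Python's double precision."""
    return sum(x * y for x, y in zip(a, b))

# Assumption: small, well-scaled values stand in for normalized activations.
activations = [(i % 7 - 3) / 8 for i in range(64)]
weights = [((i * 3) % 5 - 2) / 4 for i in range(64)]

fp16_result = dot_fp16(activations, weights)
reference = dot_reference(activations, weights)
```

For inputs of this scale the two results agree. Much longer sums of larger values can lose low-order bits in half precision, so the comparison is worth repeating with data that resembles the network in question.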
CNN Training
Training
Forward pass
Loss computation
Gradient pass
Weight update
Iterate
- Forward pass → Loss computation → Gradient pass → Weight update
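The four stages of one training iteration can be sketched on a deliberately tiny model. This is a conceptual illustration in plain Python, not MPS code: it assumes a single weight `w`, a mean squared error loss, and plain gradient descent, and the function name `train` is hypothetical.

```python
def train(xs, ys, learning_rate=0.1, iterations=20):
    """Fit y = w * x with gradient descent on mean squared error."""
    w = 0.0
    losses = []
    n = len(xs)
    for _ in range(iterations):
        # Forward pass: compute predictions with the current weight.
        predictions = [w * x for x in xs]
        # Loss computation: mean squared error against the labels.
        loss = sum((p - y) ** 2 for p, y in zip(predictions, ys)) / n
        losses.append(loss)
        # Gradient pass: derivative of the loss with respect to w.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(predictions, ys, xs)) / n
        # Weight update: step against the gradient.
        w -= learning_rate * grad_w
    return w, losses

# Toy data generated by y = 2 * x, so w should approach 2.
w, losses = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Each pass through the loop body is one iteration; the recorded losses make it easy to check that training is converging.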
Training a Neural Network with MPS
- Create training graph
- Prepare inputs
- Specify weights
- Execute graph (graph updates weights)
- Complete training process
Create Training Graph
- Describe neural network using graph API
- Image nodes — Data
- Filter nodes — Operations
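The split between image nodes (data) and filter nodes (operations) can be sketched as a small graph structure. The classes below are hypothetical Python stand-ins written for illustration; they only mirror the idea and are not the MPS graph API.

```python
class ImageNode:
    """Data in the graph: either an input value or the result of a filter."""
    def __init__(self, source=None, value=None):
        self.source = source  # FilterNode producing this image, None for an input
        self.value = value

    def evaluate(self):
        if self.source is None:
            return self.value
        return self.source.evaluate()

class FilterNode:
    """An operation that consumes image nodes and exposes a result image."""
    def __init__(self, operation, *inputs):
        self.operation = operation
        self.inputs = inputs
        self.result_image = ImageNode(source=self)

    def evaluate(self):
        return self.operation(*(image.evaluate() for image in self.inputs))

def scale(pixels):
    return [2.0 * p for p in pixels]

def relu(pixels):
    return [max(0.0, p) for p in pixels]

# Describe the network first, then evaluate it from the final image node.
source = ImageNode(value=[-1.0, 0.5, 2.0])
scaled = FilterNode(scale, source)
activated = FilterNode(relu, scaled.result_image)
output = activated.result_image.evaluate()
```

Nothing is computed while the nodes are being connected; work happens only when the final image is requested, which is what lets a graph be described once and executed many times.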
Create an Inference Graph
Prepare Inputs
- Inputs to the graph
- Batch of source images
- Batch of source states
Batches
- Batches are arrays of images or states
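As a minimal illustration of batching, the sketch below splits a list of items into fixed-size arrays. The strings are placeholders for real images, and `make_batches` is a hypothetical helper, not an MPS function.

```python
def make_batches(items, batch_size):
    """Split items into consecutive batches; the last one may be smaller."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

# Placeholder names stand in for actual image objects.
images = [f"image_{i}" for i in range(10)]
batches = make_batches(images, 4)
```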
States
MPSState
- Passes state of forward node to gradient node
- Graph manages all states
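Why a gradient node needs state from its forward node can be seen with a ReLU layer, whose gradient depends on which inputs were positive during the forward pass. The sketch below is plain Python with hypothetical names (`ReluState`, `relu_forward`, `relu_gradient`); in MPS the graph creates and passes such state objects itself.

```python
class ReluState:
    """Records which inputs were positive during the forward pass."""
    def __init__(self, mask):
        self.mask = mask

def relu_forward(inputs):
    mask = [x > 0.0 for x in inputs]
    outputs = [x if keep else 0.0 for x, keep in zip(inputs, mask)]
    return outputs, ReluState(mask)

def relu_gradient(upstream_gradients, state):
    # Gradients flow only through positions that were active going forward.
    return [g if keep else 0.0 for g, keep in zip(upstream_gradients, state.mask)]

outputs, state = relu_forward([-1.0, 2.0, 3.0])
input_gradients = relu_gradient([0.1, 0.2, 0.3], state)
```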
Loss Labels
Data Source Providers
- Convolution
- Fully Connected
- Batch normalization
- Instance normalization
- Just-in-time loading and purging of weights data
- Minimize memory footprint
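The just-in-time loading and purging pattern can be sketched as an object that holds weights only between a load and a purge call. The class below is a hypothetical Python illustration of that lifecycle; the `read_weights` callable is an assumed stand-in for reading from disk, and none of this is the actual MPS data source protocol.

```python
class WeightsProvider:
    """Holds weights in memory only between load() and purge()."""
    def __init__(self, read_weights):
        self._read_weights = read_weights  # assumed callable, e.g. reads a file
        self._weights = None

    def load(self):
        """Bring weights into memory right before they are needed."""
        self._weights = self._read_weights()
        return True

    def weights(self):
        if self._weights is None:
            raise RuntimeError("weights requested before load()")
        return self._weights

    def purge(self):
        """Release the weights to keep the memory footprint small."""
        self._weights = None

    @property
    def is_loaded(self):
        return self._weights is not None

provider = WeightsProvider(lambda: [0.1, -0.2, 0.3])
provider.load()
loaded = list(provider.weights())
provider.purge()
```

Because the consumer decides when to call load and purge, only the layers currently being set up need their weights resident at any one time.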
Execute graph
Updating Weights
- Implement optional update method on Data Source Provider
- Graph calls update method automatically
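The optional-update contract can be sketched as follows: the executor checks whether a provider implements `update` and, if so, calls it with the computed gradients. Everything here is a hypothetical Python illustration; `run_training_batch` merely stands in for graph execution and receives ready-made gradients instead of computing them.

```python
class TrainableProvider:
    """Provider that owns its weights and knows how to update them."""
    def __init__(self, weights, learning_rate):
        self.weights = list(weights)
        self.learning_rate = learning_rate
        self.update_calls = 0

    def update(self, gradients):
        self.update_calls += 1
        self.weights = [w - self.learning_rate * g
                        for w, g in zip(self.weights, gradients)]
        return self.weights

def run_training_batch(provider, gradients):
    """Stand-in for graph execution: only triggers the update callback."""
    update = getattr(provider, "update", None)
    if callable(update):  # the method is optional
        update(gradients)

provider = TrainableProvider([1.0, 2.0], learning_rate=0.5)
run_training_batch(provider, [0.2, -0.4])
```

A provider without an `update` method is simply skipped, which corresponds to a layer whose weights stay fixed during training.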
Optimizer
- Describe how to take update step on training parameters
- Used in update method of Data Source Provider
- Variants
  - MPSNNOptimizerAdam
  - MPSNNOptimizerStochasticGradientDescent
  - MPSNNOptimizerRMSProp
  - Custom
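The update steps behind the three named variants can be written out for a single scalar parameter. These are the textbook formulations in plain Python, given as an assumption about the general technique; the MPS optimizer classes may differ in details such as epsilon placement, defaults, or extra options like momentum.

```python
import math

def sgd_step(w, g, lr=0.01):
    """Stochastic gradient descent: step against the gradient."""
    return w - lr * g

def rmsprop_step(w, g, s, lr=0.001, decay=0.9, eps=1e-8):
    """RMSProp: scale the step by a running average of squared gradients."""
    s = decay * s + (1.0 - decay) * g * g
    return w - lr * g / (math.sqrt(s) + eps), s

def adam_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: bias-corrected first and second moment estimates (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

w_sgd = sgd_step(1.0, 0.5, lr=0.1)
w_rms, s = rmsprop_step(1.0, 0.5, s=0.0)
w_adam, m, v = adam_step(1.0, 0.5, m=0.0, v=0.0, t=1)
```

A custom optimizer in this framing is any other function with the same shape: it takes the current parameter, its gradient, and whatever running state it needs, and returns the new parameter and state.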
Complete training process
Demo
CNN
- 1 to 1
RNN
- 1 to Many
- Many to Many
Recurrent Neural Networks
Variants for inference and training (new)
- Single Gate
- Long Short-Term Memory (LSTM)
- Gated Recurrent Unit (GRU)
- Minimally Gated Unit (MGU)
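To make one of these variants concrete, the sketch below runs a single LSTM time step on scalar values using the standard textbook equations. It is plain Python, not MPS code; the function name `lstm_step` and the `params` layout are hypothetical choices made for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps gate name -> (input weight, recurrent weight, bias)."""
    def pre_activation(gate):
        w, u, b = params[gate]
        return w * x + u * h_prev + b

    i = sigmoid(pre_activation("input"))        # how much new content to write
    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new content
    c = f * c_prev + i * g                      # updated memory cell
    h = o * math.tanh(c)                        # updated hidden state
    return h, c

# With all parameters zero, every gate evaluates to 0.5 and the candidate to 0.
zero_params = {gate: (0.0, 0.0, 0.0)
               for gate in ("input", "forget", "output", "candidate")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

Feeding the returned `h` and `c` back in as `h_prev` and `c_prev` for the next input is what makes the network recurrent. GRU and MGU follow the same pattern with fewer gates.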
Activity Classifier
Inference
Training
Data Converters
Demo
Object classification training using TensorFlow with MPS