Metal 2 on A11 — Overview
Tech Talks
Metal 2
- GPU-driven rendering
- Platform feature alignment
- Machine learning acceleration
- Advanced optimization tools
Classical GPU Architecture
TBDR GPU Architecture
A11 GPU Architecture
Accelerated Rendering Techniques
- Deferred Rendering
- Tiled Forward
- Order Independent Transparency
- Multi-Layer Alpha Blending
- Sub-Surface Scatter
- MSAA Tone-mapping
- Custom Resolves
- Surface Aggregation
Metal 2 on A11
Advancing the TBDR Architecture
- Imageblocks
- Tile Shaders
- Imageblock Sample Coverage Control
- Raster Order Groups
- Threadgroup Sharing
Imageblocks
- 2D data structure accessible from shaders
- Single pixel access from fragment functions
- Full access from kernel functions
- Multi-plane layout
- Efficient bulk store of pixels to textures
- Supports optional format conversion
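The imageblock bullets above can be pictured with a small CPU-side model. This is only a sketch in plain C++, not Metal code: `Pixel`, `TileBlock`, `toUnorm8`, and `storeTile` are hypothetical names invented for illustration, and the RGBA8 target format is an assumption. It shows a 2D per-tile structure with more than one plane of data, single-pixel access, and a bulk store that converts float color to 8-bit on the way out.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-pixel layout with two planes of data: color and depth.
struct Pixel {
    std::array<float, 4> color; // RGBA, nominally in [0, 1]
    float depth;
};

// CPU stand-in for an imageblock: a small 2D array of Pixel for one tile.
struct TileBlock {
    std::size_t width;
    std::size_t height;
    std::vector<Pixel> pixels;

    TileBlock(std::size_t w, std::size_t h)
        : width(w), height(h),
          pixels(w * h, Pixel{{0.0f, 0.0f, 0.0f, 0.0f}, 1.0f}) {}

    // Single-pixel access, as a fragment function would see it.
    Pixel& at(std::size_t x, std::size_t y) {
        assert(x < width && y < height);
        return pixels[y * width + x];
    }
};

// Convert one float channel to 8-bit unorm, clamping to [0, 1] first.
inline std::uint8_t toUnorm8(float v) {
    float clamped = std::min(1.0f, std::max(0.0f, v));
    return static_cast<std::uint8_t>(std::lround(clamped * 255.0f));
}

// Bulk store: write the tile's color plane into a row-major RGBA8 texture,
// converting float -> 8-bit unorm per channel. The depth plane is not stored.
void storeTile(const TileBlock& tile, std::vector<std::uint8_t>& texture,
               std::size_t texWidth, std::size_t originX, std::size_t originY) {
    for (std::size_t y = 0; y < tile.height; ++y) {
        for (std::size_t x = 0; x < tile.width; ++x) {
            const Pixel& p = tile.pixels[y * tile.width + x];
            std::size_t base = ((originY + y) * texWidth + (originX + x)) * 4;
            assert(base + 3 < texture.size());
            for (std::size_t c = 0; c < 4; ++c) {
                texture[base + c] = toUnorm8(p.color[c]);
            }
        }
    }
}
```

In this model the depth plane stays in the tile and never reaches the texture, which mirrors the idea that an imageblock can hold more data per pixel than is finally written out.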
Tile Shading
- Compute within a render pass
- Access to entire imageblock
- Access to threadgroup memory
Enhanced MSAA
- A11 GPU tracks unique sample data for even faster blending
- Imageblock Sample Coverage Control
- Access sample coverage tracking data
- Resolve at any time in your render pass
- Implement custom resolve in a tile shader
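One reason to write a custom resolve, given that the talk lists MSAA tone-mapping among the accelerated techniques, is that averaging HDR samples and then tone-mapping gives a different result from tone-mapping each sample and then averaging. The following is a CPU-side C++ sketch of that difference, not tile shader code: the function names are hypothetical, and the Reinhard operator `x / (1 + x)` is an assumed stand-in for whatever tone curve an application actually uses.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Rgb = std::array<float, 3>;

// Reinhard tone-mapping operator; chosen for illustration only.
inline float toneMap(float x) { return x / (1.0f + x); }

// Box-filter resolve followed by tone-mapping: average the HDR samples,
// then apply the tone curve once to the averaged value.
Rgb resolveThenToneMap(const std::vector<Rgb>& samples) {
    assert(!samples.empty());
    Rgb sum{0.0f, 0.0f, 0.0f};
    for (const Rgb& s : samples) {
        for (std::size_t c = 0; c < 3; ++c) sum[c] += s[c];
    }
    const float n = static_cast<float>(samples.size());
    return {toneMap(sum[0] / n), toneMap(sum[1] / n), toneMap(sum[2] / n)};
}

// Custom resolve: tone-map every sample first, then average the results.
Rgb toneMapThenResolve(const std::vector<Rgb>& samples) {
    assert(!samples.empty());
    Rgb sum{0.0f, 0.0f, 0.0f};
    for (const Rgb& s : samples) {
        for (std::size_t c = 0; c < 3; ++c) sum[c] += toneMap(s[c]);
    }
    const float n = static_cast<float>(samples.size());
    return {sum[0] / n, sum[1] / n, sum[2] / n};
}
```

For a 4x MSAA edge pixel with three black samples and one sample of intensity 15, the first function lets the single bright sample dominate the pixel, while the second yields a value close to one quarter of the tone-mapped highlight, which is the smoother edge a custom resolve is usually after.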
Raster Order Groups
- Access memory from overlapping fragment functions in submission order
- Allows fragment functions to communicate
- A11 GPU additions:
  - Support for Tile Shaders and Threadgroup Imageblock
  - Support for multiple Raster Order Groups
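Why submission order matters can be shown with ordinary alpha blending, which is order-dependent. The sketch below is a CPU-side C++ model, not Metal code: `Fragment`, `blendOver`, and `blendInSubmissionOrder` are hypothetical names, and the explicit sort stands in for the ordering guarantee that the hardware would provide for overlapping fragments.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

using Rgba = std::array<float, 4>;

struct Fragment {
    unsigned primitiveIndex; // hypothetical: position in submission order
    Rgba color;              // non-premultiplied color, alpha in color[3]
};

// "Over" blend of src onto an opaque dst.
inline Rgba blendOver(const Rgba& dst, const Rgba& src) {
    const float a = src[3];
    return {src[0] * a + dst[0] * (1.0f - a),
            src[1] * a + dst[1] * (1.0f - a),
            src[2] * a + dst[2] * (1.0f - a),
            1.0f};
}

// Blend all fragments covering one pixel in submission order, regardless of
// the order in which they happen to arrive. The stable sort models the
// ordering guarantee; it is not how a GPU implements it.
Rgba blendInSubmissionOrder(Rgba background, std::vector<Fragment> frags) {
    std::stable_sort(frags.begin(), frags.end(),
                     [](const Fragment& a, const Fragment& b) {
                         return a.primitiveIndex < b.primitiveIndex;
                     });
    for (const Fragment& f : frags) {
        background = blendOver(background, f.color);
    }
    return background;
}
```

With a black background, a half-transparent red fragment submitted first, and a half-transparent blue fragment submitted second, the result is the same whichever fragment arrives first. Swapping the two submission indices changes the final color, which is the nondeterminism that ordered access removes.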
Threadgroup Sharing
- Flexible and efficient sharing of data between threads
- Threadgroups can communicate with each other
- Threads within a threadgroup can also communicate without a barrier
- Use atomic operations or a memory fence
Additional Metal 2 Features
- More accurate f16 math
- Texture Cube Arrays
- Read / Write Texture
- Array of Samplers
- Post Depth Coverage
- Flexible Compute Dispatch
- Quad Scoped Permute Operations
Improved Compute Performance
- Up to 2x math performance on:
  - Computer vision
  - Image processing
  - Machine learning