Metal for OpenGL Developers
WWDC 2018
- OpenGL, OpenGL ES, and OpenCL are deprecated
- Still available in iOS 12, macOS 10.14 and tvOS 12
- Begin transitioning to Metal
Choosing an Approach
- High-level Apple frameworks
  - SpriteKit, SceneKit, and Core Image
- Third-party engines
  - Unity, Unreal, Lumberyard, etc.
  - Update to the latest version
Challenges with OpenGL
- OpenGL was designed more than 25 years ago
  - Core architecture reflects the origins of 3D graphics
  - Extensions retrofitted some GPU features
- Fundamental design choices based on past principles
  - GPU pipeline has changed
  - Multithreaded operation not considered
  - Asynchronous processing not part of the core design
Design Goals for Metal
- Efficient GPU interaction
  - Low CPU overhead
  - Multithreaded execution
  - Predictable operation
  - Resource and synchronization control
- Approachable to OpenGL developers
- Built for modern and Apple-designed GPUs
Key Conceptual Differences
- Expensive operations less frequent
  - Expensive CPU operations performed less often
  - More GPU command generation during object creation
  - Less needed when rendering
- Modern GPU pipeline
  - Reflects modern GPU architectures
  - Closer match yields less costly translation to GPU commands
  - State grouped more efficiently
- Multithreaded execution
  - Designed for multithreaded execution
  - Clear rules for multithreaded usage
  - Cross-thread object usability
- Execution model
  - True interaction between software and GPU
  - Predictable operation allows efficient designs
  - Thinner stack between application and GPU
Command Encoders
- Render Command Encoder
- Blit Command Encoder
- Compute Command Encoder
Render Command Encoder
- Commands for a render pass
  - Encodes a series of render commands
  - Also called a render pass
  - Sets render objects for the graphics pipeline (buffers, textures, shaders)
  - Issues draw commands (draw primitives, draw indexed primitives, instanced draws)
- Render targets
  - Associated with a set of render targets (textures for rendering)
  - Specify the set of render targets upon creation
  - All draw commands are directed to these for the lifetime of the encoder
  - New render targets need a new encoder
  - Clear delineation between sets of render targets
Render Object
- Textures
- Buffers
- Samplers
- Render pipeline states
- Depth stencil states
Render Object Creation
- Create from a device
  - Usable only on that device
- Object state set at creation
  - Descriptor object specifies properties for render object
- State set at creation fixed for the lifetime of the object
  - Image data of textures and values in buffers can change
- Metal compiles objects into GPU state once
  - Never needs to check for changes and recompile
- Multithreaded usage more efficient
  - Metal does not need to protect state from changes on other threads
Metal Porting
Metal Shading Language
- Based on C++
  - Classes, templates, structs, enums, namespaces
- Built-in types for vectors and matrices
- Built-in functions and operators
- Built-in classes for textures and samplers
SIMD Type Library
Types for shader development
- Vector and matrix types
- Usable with Metal shading language and application code
Shader Compilation
Build with Xcode
- Xcode compiles shaders into a Metal library (.metallib)
- Front-end compilation to binary intermediate representation
- Avoids parsing time on customer systems
- By default, all shaders built into default.metallib
- Placed in app bundle for run time retrieval
Runtime Shader Compilation ⚠️
- Can also build shaders from source at runtime
- Significant disadvantages
  - Full shader compilation occurs at runtime
  - Compilation errors less obvious
  - No header sharing between application and runtime-built shaders
- Build-time compilation recommended
Devices
- A device represents one GPU
- Creates render objects
  - Textures, buffers, pipelines
- On macOS, multiple devices may be available
- Default device suitable for most applications
Command Queues
- Queue created from a device
- Queue executes command buffers in order
- Create queue at initialization
- Typically one queue is sufficient
Texture
Storage Modes
Texture Differences ⚠️
- Sampler state never part of texture
  - Wrap modes, filtering, min/max LOD
- Texture image data not flipped
  - OpenGL uses bottom-left origin, Metal uses top-left origin
- Metal does not perform format conversion
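Because the two APIs disagree on the texture origin, image data that was prepared for OpenGL may need its rows reversed before it is uploaded to a Metal texture (flipping the texture coordinate in the shader is another option). The following is a minimal C++ sketch of the row flip only. It is not Metal API code. The helper name `flipRowsVertically` is hypothetical, and the sketch assumes tightly packed rows with no padding.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Metal): reverses the row order of an image
// in place, converting between bottom-left and top-left origin layouts.
// Assumes tightly packed rows: bytesPerRow == width * bytesPerPixel.
void flipRowsVertically(std::vector<std::uint8_t>& pixels,
                        std::size_t bytesPerRow,
                        std::size_t height) {
    if (height < 2 || bytesPerRow == 0 || pixels.size() < bytesPerRow * height) {
        return;  // nothing to flip, or the buffer is too small for the stated size
    }
    // Swap the first row with the last, the second with the second-to-last, etc.
    for (std::size_t top = 0, bottom = height - 1; top < bottom; ++top, --bottom) {
        auto topRow = pixels.begin() + static_cast<std::ptrdiff_t>(top * bytesPerRow);
        auto bottomRow = pixels.begin() + static_cast<std::ptrdiff_t>(bottom * bytesPerRow);
        std::swap_ranges(topRow, topRow + static_cast<std::ptrdiff_t>(bytesPerRow), bottomRow);
    }
}
```

The flip would run once at load time, before the pixel data is copied into the texture, so it adds no per-frame cost.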
Buffers
- Metal uses buffers for vertices, indices, and all uniform data
- OpenGL’s vertex, element, and uniform buffers are similar
- Easier to port apps that have adopted these
Notes About Buffer Data ⚠️
!!! Pay attention to alignment !!!
- SIMD library vector and matrix types follow the same rules as Metal shaders
- Special packed vector types available to shaders
  - packed_float3 consumes 12 bytes
  - packed_half3 consumes 6 bytes
- Cannot directly operate on packed types
  - Cast to non-packed type required
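The alignment point can be illustrated with plain C++ stand-ins. In the sketch below, `Float3` and `PackedFloat3` are hypothetical structs that only mimic the size and alignment of a 16-byte three-component float vector and of `packed_float3`. They are not the real SIMD or Metal types (application code would normally use the types from `<simd/simd.h>`). The sketch shows how the choice changes struct offsets, and how a packed value is converted to a non-packed one before use.

```cpp
#include <cstddef>

// Stand-in for a non-packed 3-component float vector:
// 16-byte size and 16-byte alignment (one float of padding).
struct alignas(16) Float3 { float x, y, z; };

// Stand-in for packed_float3: exactly 12 bytes, 4-byte alignment.
struct PackedFloat3 { float x, y, z; };

// The "cast to non-packed type" step: copy components into the aligned type.
inline Float3 unpack(const PackedFloat3& p) { return Float3{p.x, p.y, p.z}; }

// The same logical vertex, laid out with each vector type.
struct VertexAligned { Float3 position; float scale; };
struct VertexPacked  { PackedFloat3 position; float scale; };

static_assert(sizeof(Float3) == 16, "non-packed vector occupies 16 bytes");
static_assert(sizeof(PackedFloat3) == 12, "packed vector occupies 12 bytes");
static_assert(offsetof(VertexAligned, scale) == 16, "scale follows the padding");
static_assert(offsetof(VertexPacked, scale) == 12, "scale follows immediately");
static_assert(sizeof(VertexAligned) == 32, "struct padded to 16-byte multiple");
static_assert(sizeof(VertexPacked) == 16, "no padding needed");
```

If CPU-side vertex data was written as three tightly packed floats (as is common in OpenGL code), declaring the shader-side field as a non-packed vector would read every vertex after the first from the wrong offset. That mismatch is what the alignment warning is about.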
Storage Modes for Porting ⚠️
- Use the most convenient storage modes
  - Easier access to data
- On iOS
  - Create all textures and buffers with MTLStorageModeShared
- On macOS
  - Create all textures with MTLStorageModeManaged
  - Make judicious use of MTLStorageModeShared for buffers
  - Separate GPU-only data from CPU-accessible data
MetalKit
Texture and buffer utilities
- Texture loading
  - Textures from KTX, PVR, JPG, PNG, TIFF, etc.
- Model loading
  - Vertex buffers from USD, OBJ, Alembic, etc.
Pipelines
Pipeline Differences
Pipeline Building
- Create at initialization
  - Full compilation key advantage of state grouping
  - Choose a canonical vertex layout for meshes
  - Use a limited set of render target formats
- Lazy creation at draw time ⚠️
  - Store pipeline state objects in a dictionary using descriptor as key
  - Construct descriptor at draw time with current state
  - Retrieve existing pipeline from dictionary OR build new pipeline
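The lazy-creation fallback amounts to a cache keyed by the descriptor. The sketch below models that in plain C++ with no Metal calls. `PipelineKey`, its fields, `PipelineState`, and `PipelineCache` are all hypothetical stand-ins chosen for illustration. In a real port the key would carry whatever state the application's descriptor holds, and the "build" step would be the expensive pipeline compilation.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a render pipeline descriptor used as dictionary key.
struct PipelineKey {
    std::string vertexFunction;
    std::string fragmentFunction;
    int colorPixelFormat;
    bool blendingEnabled;

    bool operator==(const PipelineKey& o) const {
        return vertexFunction == o.vertexFunction &&
               fragmentFunction == o.fragmentFunction &&
               colorPixelFormat == o.colorPixelFormat &&
               blendingEnabled == o.blendingEnabled;
    }
};

// Combines the hashes of all key fields.
struct PipelineKeyHash {
    std::size_t operator()(const PipelineKey& k) const {
        std::size_t h = std::hash<std::string>{}(k.vertexFunction);
        auto mix = [&h](std::size_t v) { h ^= v + 0x9e3779b9 + (h << 6) + (h >> 2); };
        mix(std::hash<std::string>{}(k.fragmentFunction));
        mix(std::hash<int>{}(k.colorPixelFormat));
        mix(std::hash<bool>{}(k.blendingEnabled));
        return h;
    }
};

// Stand-in for a compiled, immutable pipeline state object.
struct PipelineState { PipelineKey key; };

class PipelineCache {
public:
    // Returns the cached pipeline for this key, building it on first use.
    std::shared_ptr<const PipelineState> pipelineFor(const PipelineKey& key) {
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;  // hit: no compilation
        std::shared_ptr<const PipelineState> built =
            std::make_shared<PipelineState>(PipelineState{key});  // miss: "compile"
        ++buildCount_;
        cache_.emplace(key, built);
        return built;
    }
    std::size_t buildCount() const { return buildCount_; }

private:
    std::unordered_map<PipelineKey, std::shared_ptr<const PipelineState>,
                       PipelineKeyHash> cache_;
    std::size_t buildCount_ = 0;
};
```

The first draw that uses a new state combination still pays the full build cost, which is why the notes flag this approach with a warning and prefer creation at initialization.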
Create Render Objects at Initialization
- Object creation expensive
  - Pipelines require backend compilation
  - Buffers and textures need allocations
- Once created, much faster usage during rendering
Command Buffers
- Explicit control over command buffer submission
- Start with one command buffer per frame
- Optionally split a frame into multiple command buffers to
  - Submit early and get the GPU started
  - Build commands on multiple threads
- Completion handler invoked when execution is finished
Resource Updates
- Resources are explicitly managed in Metal
- No implicit synchronization like OpenGL
- Allows for fine-grained synchronization
- Application has complete control
- Best model dependent on usage
- Triple buffering recommended
Problem
Temporary Solution ⚠️
Synchronous wait after every frame
Triple Buffering
Shared buffer pool 👍
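The triple-buffering idea can be sketched as a small pool that hands out one of three buffer slots per frame and blocks the CPU only when all three are still in use by the GPU. The C++ below is a model of that bookkeeping, not Metal code. `InFlightFramePool` and its method names are hypothetical. In a Metal app, `release()` would correspond to signalling from the command buffer's completion handler, and each slot index would select one of three pre-allocated buffers.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>

constexpr std::size_t kMaxFramesInFlight = 3;

// Hypothetical model of a shared buffer pool for triple buffering.
class InFlightFramePool {
public:
    // CPU side, start of frame: waits until fewer than three frames are in
    // flight, then returns the index of the buffer slot to write this frame.
    std::size_t acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        slotFree_.wait(lock, [this] { return inFlight_ < kMaxFramesInFlight; });
        ++inFlight_;
        std::size_t index = next_;
        next_ = (next_ + 1) % kMaxFramesInFlight;  // round-robin over the slots
        return index;
    }

    // GPU side: would be called when a frame's commands finish executing.
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (inFlight_ > 0) --inFlight_;
        }
        slotFree_.notify_one();  // wake a CPU thread waiting in acquire()
    }

    std::size_t framesInFlight() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return inFlight_;
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable slotFree_;
    std::size_t inFlight_ = 0;
    std::size_t next_ = 0;
};
```

Compared with a synchronous wait after every frame, the CPU here can run up to two frames ahead of the GPU. The round-robin index is safe under the assumption that frames complete in submission order, so the oldest slot is always the one that was just released.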
Render Encoders
Render Pass Descriptor
Render Pass Setup
Render Pass Load and Store Actions
Rendering with OpenGL
Rendering with Metal
Display
Drawables and presentation
- Drawables: textures for on-screen display
- Each frame MTKView provides
  - Drawable texture
  - Render pass descriptor set up with the drawable
- Render to drawables like any other texture
- Present drawable when done rendering
Incrementally Porting ⚠️
- Create shared Metal/OpenGL textures using IOSurface or CVPixelBuffer
  - Render to texture in one API and read it in the other
- Can enable mixed Metal/OpenGL applications
- Sample Link
Multithreading
Metal is designed to facilitate multithreading
- Consider multithreading if the application is CPU bound
- Encode multiple command buffers simultaneously
- Split a single render pass using MTLParallelRenderCommandEncoder
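Encoding several command buffers at once works because the execution order can be fixed before any encoding starts, and each thread then fills only its own buffer. The C++ sketch below models that pattern with standard threads and no Metal calls. `CommandBuffer` and `encodeFrameInParallel` are hypothetical stand-ins. In Metal, the order would be reserved on the queue (for example with the command buffer's `enqueue` method) before the threads start encoding.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a command buffer: just a list of encoded commands.
struct CommandBuffer { std::vector<std::string> commands; };

// Encodes one command buffer per pass, each on its own thread, and returns
// them in execution order. The slot index fixes the order up front, so the
// threads may finish in any order without changing the result.
std::vector<CommandBuffer> encodeFrameInParallel(std::size_t passCount,
                                                 std::size_t drawsPerPass) {
    std::vector<CommandBuffer> buffers(passCount);
    std::vector<std::thread> workers;
    workers.reserve(passCount);
    for (std::size_t pass = 0; pass < passCount; ++pass) {
        // Each thread touches only buffers[pass], so no locking is needed.
        workers.emplace_back([&buffers, pass, drawsPerPass] {
            for (std::size_t draw = 0; draw < drawsPerPass; ++draw) {
                buffers[pass].commands.push_back(
                    "pass " + std::to_string(pass) + " draw " + std::to_string(draw));
            }
        });
    }
    for (auto& worker : workers) worker.join();  // all encoding done before submit
    return buffers;
}
```

One thread per pass keeps the sketch short. A real application would more likely hand the passes to an existing thread pool, and this only pays off when CPU-side encoding is the bottleneck.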
Staying on the GPU
- Metal natively supports compute
- Performance benefits
  - Reduces CPU utilization
  - Reduces GPU-CPU synchronization points
  - Frees data bandwidth to the GPU
- New algorithms possible
  - Particle systems, physics, object culling
Developer Tools
Debug and optimize your applications
- Xcode contains an advanced set of GPU tools
- Enable Metal’s API validation layer
  - On by default when the target is run from Xcode