Object Tracking in Vision

WWDC 2018

Vision in a Nutshell

One stop for solving computer vision problems
Simple, consistent interface
Runs on iOS, macOS, and tvOS
Privacy-oriented
Continuously evolving

Vision Basics

Requests

Request Handlers

Observations

Family of classes derived from VNObservation
How to obtain a VNObservation?
- Returned in VNRequest results property
- Can be manually created
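
The two ways of obtaining an observation can be sketched as follows. This is a minimal illustration, not code from the session: the function names detectFaces(in:) and makeObservation(forNormalizedBox:) are hypothetical, and the CGImage is assumed to be supplied by the caller.

```swift
import Vision
import CoreGraphics

// Way 1: read observations from a request's results after it has run.
// `cgImage` is a hypothetical input supplied by the caller.
func detectFaces(in cgImage: CGImage) throws -> [VNFaceObservation] {
    let request = VNDetectFaceRectanglesRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])
    // An empty array is returned when no results are available.
    return (request.results as? [VNFaceObservation]) ?? []
}

// Way 2: create an observation manually, for example from a user-drawn box.
// Vision bounding boxes are normalized to [0, 1] with a lower-left origin.
func makeObservation(forNormalizedBox box: CGRect) -> VNDetectedObjectObservation {
    return VNDetectedObjectObservation(boundingBox: box)
}
```

A manually created observation is typically what seeds a tracker when the user selects the object by hand.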

New Face Detector

Finds more faces
Now orientation-agnostic

VNDetectFaceRectanglesRequest (Sample API)
VNDetectFaceRectanglesRequestRevision2
VNFaceObservation has 2 new properties: roll and yaw

Request Revisioning

Vision requests now support revisioning
Future-proof your app — Error for unavailable functionality
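
As a hedged sketch of how revisioning might be used, the snippet below pins the face detector to revision 2 only when the running OS reports that revision as supported. The helper name makePinnedFaceRequest() is hypothetical; revision, supportedRevisions, and VNDetectFaceRectanglesRequestRevision2 are Vision API.

```swift
import Vision

// Hypothetical helper: builds a face request pinned to revision 2 when available.
func makePinnedFaceRequest() -> VNDetectFaceRectanglesRequest {
    let request = VNDetectFaceRectanglesRequest()
    let wanted = VNDetectFaceRectanglesRequestRevision2
    // Only pin the revision if this OS version actually provides it;
    // otherwise keep the default revision the request was created with.
    if VNDetectFaceRectanglesRequest.supportedRevisions.contains(wanted) {
        request.revision = wanted
    }
    return request
}
```

Pinning a revision keeps the algorithm's behavior stable across OS updates, at the cost of not picking up newer revisions automatically.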

Image Request Handler

Used to process one or more requests on the same image
Optimizes performance by caching image derivatives and request results
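
A minimal sketch of running several requests on the same image through one handler, so that Vision has the opportunity to reuse cached image derivatives. The function name countFacesAndRectangles(in:) is hypothetical, and the choice of the two requests is only for illustration.

```swift
import Vision
import CoreGraphics

// Hypothetical helper: runs two different requests against one image.
func countFacesAndRectangles(in cgImage: CGImage) throws -> (faces: Int, rectangles: Int) {
    let faceRequest = VNDetectFaceRectanglesRequest()
    let rectangleRequest = VNDetectRectanglesRequest()

    // One handler per image; both requests go through the same handler.
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([faceRequest, rectangleRequest])

    return (faceRequest.results?.count ?? 0,
            rectangleRequest.results?.count ?? 0)
}
```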

Sequence Request Handler

Processes requests on a sequence of images
Used to process 2 types of requests — Tracking and Image Registration
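
Tracking is covered further below, so here is a hedged sketch of the other request type, image registration. It estimates the translation between two consecutive frames. The function name translation(from:to:using:) is hypothetical, and both pixel buffers are assumed to come from the caller (for example, a camera feed).

```swift
import Vision
import CoreVideo
import CoreGraphics

// Hypothetical helper: estimates how `current` is shifted relative to `previous`.
func translation(from previous: CVPixelBuffer,
                 to current: CVPixelBuffer,
                 using handler: VNSequenceRequestHandler) throws -> CGAffineTransform? {
    // The request targets the current frame; the handler is given the previous one.
    let request = VNTranslationalImageRegistrationRequest(targetedCVPixelBuffer: current)
    try handler.perform([request], on: previous)

    let alignment = request.results?.first as? VNImageTranslationAlignmentObservation
    return alignment?.alignmentTransform
}
```

The same VNSequenceRequestHandler instance would be reused across frames for as long as the sequence is being processed.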

VNRequest Initialization

Mandatory — Must be provided via initializer, overriding is OK

Optional — Initialized to default value, overriding is OK
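
A short sketch of the two kinds of properties, using a tracking request as the example. The helper name makeTrackingRequest(seed:) and the bounding-box values are made up for illustration.

```swift
import Vision
import CoreGraphics

// Hypothetical helper illustrating mandatory vs. optional request properties.
func makeTrackingRequest(seed: VNDetectedObjectObservation) -> VNTrackObjectRequest {
    // Mandatory: the input observation must be passed to the initializer.
    let request = VNTrackObjectRequest(detectedObjectObservation: seed)

    // Optional: trackingLevel starts with a default value and can be overridden.
    request.trackingLevel = .accurate

    // Mandatory properties can also be overridden later, for example to feed
    // the tracker a newer observation of the same object.
    request.inputObservation = seed
    return request
}

// Example seed box in normalized coordinates (illustrative values only).
let exampleSeed = VNDetectedObjectObservation(
    boundingBox: CGRect(x: 0.4, y: 0.4, width: 0.2, height: 0.2))
let exampleRequest = makeTrackingRequest(seed: exampleSeed)
```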

Understanding Results

Collection of VNObservation objects in VNRequest results property

The number of observations ranges from 0 to n
VNObservation is immutable
Important common observation properties:
- uuid — is used to match related results
- confidence — Shows quality of returned results
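
One way to use these two properties together is sketched below: observations under a confidence threshold are dropped and the rest are indexed by uuid so that related results can be matched later. The function name reliableObservations(_:minimumConfidence:) and the threshold are assumptions for illustration, not values recommended in the session.

```swift
import Foundation
import Vision

// Hypothetical helper: keeps confident observations, keyed by their uuid.
func reliableObservations(_ observations: [VNObservation],
                          minimumConfidence: VNConfidence) -> [UUID: VNObservation] {
    var byID: [UUID: VNObservation] = [:]
    for observation in observations where observation.confidence >= minimumConfidence {
        byID[observation.uuid] = observation
    }
    return byID
}
```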

Request Pipelines

Pipeline: requests are executed in an order that fulfills their dependencies

Lifecycle Management

How long to keep objects in memory?

Image Request Handler (While the image needs processing)
Sequence Request Handler (While the sequence needs processing)
Requests/Observations (Lightweight objects, create/release as needed)

Where to Process Your Requests?

Many requests in Vision rely on Neural Networks
Neural Networks usually run faster on GPUs
Vision can run requests on both CPU and GPU
- Default: Use GPU, switch to CPU if GPU is busy
- Explicit: Set VNRequest usesCPUOnly to true
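
The explicit option amounts to a single property on the request. A minimal sketch, where makeCPUOnlyFaceRequest() is a hypothetical helper name:

```swift
import Vision

// Hypothetical helper: a face request that is forced to run on the CPU.
func makeCPUOnlyFaceRequest() -> VNDetectFaceRectanglesRequest {
    let request = VNDetectFaceRectanglesRequest()
    // Leave the GPU free for other work, such as rendering.
    request.usesCPUOnly = true
    return request
}
```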

Tracking in General

Object of interest: Auto-detected or manually selected
Sequence of frames: Camera feed
Tracking: Look for the object of interest
Applications: Focus tracking with camera

Why Tracking and Not Detection?

No specific detectors for all objects
Need to match detected objects
Trackers use temporal information
Speed — Trackers are faster
Trackers are smoother, not as jittery

Tracking Types in Vision

Demo

Sample Link

Tracking in Vision

Initial object of interest selection
- Automatic: By running an appropriate detector
- Manual: User input
One tracking request per tracked object (1:1)
2 Types: VNTrackObjectRequest, VNTrackRectangleRequest
Tracking algorithm: trackingLevel = .fast / .accurate
Tracking quality: Use observation confidence property
How many objects can we track simultaneously?
- Limit: 16 trackers of each type at a time
- Error is returned if over limit
How to release a tracker?
- Set the request's isLastFrame property to true (lastFrame in Objective-C)
- Release VNSequenceRequestHandler
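
The points above can be combined into a small single-object tracker. This is a sketch under stated assumptions rather than the session's sample code: the class name SingleObjectTracker is hypothetical, the initial observation is assumed to come from a detector or from user input, and frames are assumed to arrive as CVPixelBuffer values from a camera feed.

```swift
import Vision
import CoreVideo

// Hypothetical wrapper: one tracking request for one tracked object.
final class SingleObjectTracker {
    private let sequenceHandler = VNSequenceRequestHandler()
    private let request: VNTrackObjectRequest

    init(initialObservation: VNDetectedObjectObservation) {
        request = VNTrackObjectRequest(detectedObjectObservation: initialObservation)
        request.trackingLevel = .accurate   // or .fast
    }

    // Tracks the object into the next frame. Pass isLastFrame = true on the
    // final call so that Vision can release the underlying tracker.
    func track(in frame: CVPixelBuffer,
               isLastFrame: Bool = false) throws -> VNDetectedObjectObservation? {
        request.isLastFrame = isLastFrame
        try sequenceHandler.perform([request], on: frame)

        guard let result = request.results?.first as? VNDetectedObjectObservation else {
            return nil
        }
        // Feed the newest observation back in for the next frame.
        request.inputObservation = result
        return result
    }
}
```

The returned observation's confidence can be checked by the caller to judge tracking quality on each frame.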

Tracking Challenges

Fast or accurate?
Initial bounding box location: choose a salient object
Which confidence level threshold to use?
Consider rerunning detectors every N frames

Objects change their shape, size, appearance, color, …
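
One possible way to combine a confidence threshold with periodic re-detection is sketched below. The type RedetectionPolicy, its parameter names, and the example values are all assumptions for illustration; suitable values for N and for the threshold depend on the use case.

```swift
// Hypothetical policy: re-run the detector every `frameInterval` frames,
// or earlier if the tracker's confidence drops below `minimumConfidence`.
struct RedetectionPolicy {
    let frameInterval: Int
    let minimumConfidence: Float
    private var framesSinceDetection = 0

    init(frameInterval: Int, minimumConfidence: Float) {
        self.frameInterval = frameInterval
        self.minimumConfidence = minimumConfidence
    }

    // Call once per tracked frame with the latest observation's confidence.
    mutating func shouldRedetect(trackerConfidence: Float) -> Bool {
        framesSinceDetection += 1
        if framesSinceDetection >= frameInterval || trackerConfidence < minimumConfidence {
            framesSinceDetection = 0   // the caller is expected to re-detect now
            return true
        }
        return false
    }
}

var policy = RedetectionPolicy(frameInterval: 3, minimumConfidence: 0.3)
let decisions = [
    policy.shouldRedetect(trackerConfidence: 0.9),  // frame 1
    policy.shouldRedetect(trackerConfidence: 0.9),  // frame 2
    policy.shouldRedetect(trackerConfidence: 0.9),  // frame 3: interval reached
    policy.shouldRedetect(trackerConfidence: 0.1),  // low confidence
]
```

When the policy asks for re-detection, the old tracker would be released and a new one seeded from the fresh detection.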
