
Metal video processing for iOS and tvOS



Real-time video processing is a special case of digital signal processing. Technologies like Virtual Reality (VR) and Augmented Reality (AR) rely heavily on real-time video processing to extract semantic information from each video frame and use it for object detection and tracking, face detection, and other analysis techniques. Real-time video processing on a mobile device is a rather complex task because of the limited resources available on smartphones and tablets, but you can achieve great results with the right techniques.

In this post I will show you how to process video in real time using the Metal framework, which harnesses the power of the GPU. In one of our previous posts you can find the details of how to set up a Metal rendering pipeline and use compute shaders for image processing. Here we will do something similar, but this time we will process video frames.

AV Foundation

Before continuing with the implementation of video processing in Metal, let's take a quick look at the AV Foundation framework and the components we need to play a video. In a previous post, I showed you how to use AV Foundation to record video with your iPhone or iPad. Here we will use another set of AV Foundation classes to read and play a video file on an iOS or tvOS device.

You can play a video on an iPhone or Apple TV in different ways, but for the purpose of this post I will use the AVPlayer class together with an AVPlayerItemVideoOutput to extract each video frame and send it to Metal for real-time processing on the GPU.

An AVPlayer is a controller object used to manage the playback and timing of a media asset. You can use an AVPlayer to play local and remote file-based media, such as video and audio files. Besides the standard controls to play, pause, change the playback rate, and seek to different times within the media timeline, an AVPlayer object gives access to each frame of a video asset through an AVPlayerItemVideoOutput object. This object returns a reference to a Core Video pixel buffer (an object of type CVPixelBuffer). Once you get the pixel buffer, you can convert it to a Metal texture and process it on the GPU.
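As a minimal sketch of that last step, a pixel buffer can be wrapped in a Metal texture through a CVMetalTextureCache. The PixelBufferConverter type and its names below are illustrative (not from the original article), and the code assumes BGRA pixel buffers:

import AVFoundation
import Metal
import CoreVideo

// Illustrative helper: wraps a CVPixelBuffer in a Metal texture via a texture cache.
final class PixelBufferConverter {
    private let device: MTLDevice
    private var textureCache: CVMetalTextureCache?

    init?(device: MTLDevice) {
        self.device = device
        // Create a texture cache that backs Metal textures with pixel buffer memory.
        guard CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &textureCache) == kCVReturnSuccess else {
            return nil
        }
    }

    // Assumes a BGRA pixel buffer, as typically requested from AVPlayerItemVideoOutput.
    func makeTexture(from pixelBuffer: CVPixelBuffer) -> MTLTexture? {
        guard let cache = textureCache else { return nil }
        let width = CVPixelBufferGetWidth(pixelBuffer)
        let height = CVPixelBufferGetHeight(pixelBuffer)
        var cvTexture: CVMetalTexture?
        let status = CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                               cache,
                                                               pixelBuffer,
                                                               nil,
                                                               .bgra8Unorm,
                                                               width,
                                                               height,
                                                               0,
                                                               &cvTexture)
        guard status == kCVReturnSuccess, let cvTexture = cvTexture else { return nil }
        // The returned MTLTexture shares memory with the pixel buffer.
        return CVMetalTextureGetTexture(cvTexture)
    }
}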

Creating an AVPlayer is very simple. You can initialize it either with the URL of the video file or with an AVPlayerItem object, using one of the following init methods:
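The original listing is not reproduced above, so here is a minimal sketch of the two standard ways to create a player; the file URL is a placeholder:

import AVFoundation

// Initialize a player directly from the URL of a media file...
let url = URL(fileURLWithPath: "video.mp4")   // placeholder path
let playerFromURL = AVPlayer(url: url)

// ...or from an AVPlayerItem, which gives more control over the item's configuration.
let item = AVPlayerItem(url: url)
let playerFromItem = AVPlayer(playerItem: item)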

An AVPlayerItem stores a reference to an AVAsset object representing the media to be played. An AVAsset is an abstract, immutable class used to model timed audiovisual media such as video and audio. Since AVAsset is an abstract class, you cannot use it directly. Instead, use one of the two subclasses that come with the framework: an AVURLAsset or an AVComposition (with its mutable subclass AVMutableComposition). An AVURLAsset is a concrete subclass of AVAsset that you can use to load media from a URL, while an AVComposition allows you to combine media data from multiple file-based sources in a custom temporal arrangement.

In this post I will use an AVURLAsset. The following source code highlights how all of these AV Foundation classes can be combined:
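The original listing is not included here; the following is a minimal sketch of how the pieces could fit together, assuming a video named sample.mp4 bundled with the app (the file name and the BGRA output settings are illustrative assumptions):

import AVFoundation

// Hypothetical file name; any bundled video would do.
guard let videoURL = Bundle.main.url(forResource: "sample", withExtension: "mp4") else {
    fatalError("Video file not found in the app bundle")
}

// Wrap the file in an AVURLAsset and build a player item from it.
let asset = AVURLAsset(url: videoURL)
let playerItem = AVPlayerItem(asset: asset)

// Request BGRA pixel buffers so the frames can later be wrapped in Metal textures.
let outputSettings: [String: Any] = [
    kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
]
let videoOutput = AVPlayerItemVideoOutput(pixelBufferAttributes: outputSettings)
playerItem.add(videoOutput)

// Finally, create the player that drives playback of the item.
let player = AVPlayer(playerItem: playerItem)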

To extract the frames from the video file while the player is playing, you use an AVPlayerItemVideoOutput object. When you get a video frame, you can use Metal to process it on the GPU, as the sketch below illustrates. Let's now build an example to demonstrate it.
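Here is a minimal sketch of the frame-pulling step, assuming videoOutput is the AVPlayerItemVideoOutput attached to the playing item; the function name and the host-time-based timing are illustrative:

import AVFoundation
import QuartzCore

// Called once per display refresh, e.g. from a CADisplayLink or MTKView draw callback.
func readCurrentFrame(from videoOutput: AVPlayerItemVideoOutput) -> CVPixelBuffer? {
    // Ask the output which item time corresponds to "now".
    let itemTime = videoOutput.itemTime(forHostTime: CACurrentMediaTime())

    // Only copy a buffer when a new frame is actually available.
    guard videoOutput.hasNewPixelBuffer(forItemTime: itemTime) else { return nil }
    return videoOutput.copyPixelBuffer(forItemTime: itemTime, itemTimeForDisplay: nil)
}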

Video Processor App

Create a new Xcode project. Select an iOS Single View Application and name it VideoProcessor. Open the ViewController.swift file and import AVFoundation.

Since we need an AVPlayer, let's add the following property to the view controller:
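The article's exact listing is not reproduced here; a minimal sketch of that property could look like this:

import UIKit
import AVFoundation

class ViewController: UIViewController {
    // The player that drives playback of the video we want to process.
    var player: AVPlayer!
}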

As discussed above, the player gives access to each video frame through an AVPlayerItemVideoOutput object. So let's add an additional property to the view controller:
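A sketch of that property, added inside the same ViewController class; requesting BGRA buffers here is an assumption so that the frames map cleanly to Metal textures:

// Gives access to each decoded frame as a CVPixelBuffer.
let videoOutput = AVPlayerItemVideoOutput(pixelBufferAttributes: [
    kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
])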