Case Study Zoom-In: How We Optimized Video Streaming from Device to iOS App over Wi-Fi

Video streaming has become a powerful feature for many mobile applications and use cases. However, building an application around it requires meticulous planning and a technology mix tailored to your exact needs.

Today, we’ll tell you how an optimal selection of technologies has helped our client improve the quality of in-app video broadcasting and reduce its latency.

Background

Let’s set some context first. Our client is a manufacturer of observation optical devices for amateur and professional use. The client’s solution had a video streaming feature that allowed for unit-to-phone video streaming via Wi-Fi.

However, the existing solution had some flaws that degraded the user experience and depleted the phone’s battery. That was one of the tasks the Orangesoft team had to tackle — optimizing video streaming from an optical device to an iOS app.

But one thing made this task trickier: the Wi-Fi-enabled device didn't support protocols with automatic video stream format detection. Instead, it sent raw binary data over a UDP socket. That data represents an H.264 video stream carried over the RTP protocol, in which each video packet has start and end identifiers; once all of a frame's packets have been received, they are decoded into a video frame.
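To give a feel for what handling such a stream involves, here is a minimal sketch of parsing the fixed 12-byte RTP header defined by RFC 3550. This is illustrative only: the field layout follows the public RTP spec, not the client's actual code, and for H.264 the "end of frame" signal is conventionally the marker bit.

```swift
import Foundation

/// Minimal parser for the fixed 12-byte RTP header (RFC 3550).
/// Illustrative sketch; field layout follows the public spec.
struct RTPHeader {
    let version: UInt8
    let marker: Bool          // for H.264, typically flags the last packet of a frame
    let payloadType: UInt8
    let sequenceNumber: UInt16
    let timestamp: UInt32
    let ssrc: UInt32

    init?(packet: Data) {
        guard packet.count >= 12 else { return nil }
        let b = [UInt8](packet.prefix(12))
        version = b[0] >> 6
        marker = b[1] & 0x80 != 0
        payloadType = b[1] & 0x7F
        sequenceNumber = UInt16(b[2]) << 8 | UInt16(b[3])
        timestamp = UInt32(b[4]) << 24 | UInt32(b[5]) << 16 | UInt32(b[6]) << 8 | UInt32(b[7])
        ssrc = UInt32(b[8]) << 24 | UInt32(b[9]) << 16 | UInt32(b[10]) << 8 | UInt32(b[11])
    }
}
```

The sequence number is the key field here: it is what lets a receiver notice that a UDP packet was lost mid-frame.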

Stop #1: Analysis of the existing tech solution

The client’s solution relied on VLC Media Player as a medium for communication between the source and the receiver. The VLC library is known for its versatility, but in this particular case, it led to the following issues:

  • Large app size — the VLC framework supports a wide variety of streaming protocols, adding to the application’s size.
  • High latency of about 1,500 ms, leading to freezes and lag.

On top of that, VLC Media Player is essentially a black box, which makes it difficult to look under the hood and tune its parameters. Nevertheless, our team managed to reduce the frame delay to 1,200 ms.

Our developers used two smartphones to measure the delay. They ran a timer on one smartphone while the other was used for livestreaming the count-up. They took a picture of the two displays and calculated the difference between the real picture and the live-streamed version.

However, a frame delay of 1,200 ms was still a far cry from the client’s requirements, as the application needed a latency of no more than 500 ms for an optimized user experience. So, our team decided to take a different path.

Stop #2: Switching from VLC to the FFmpeg library

During research and analysis, our team realized that configuring the VLC library couldn’t eliminate the root cause of the frame delay. As the VLC Media Player is based on the open-source FFmpeg library, it was only logical to drop VLC in favor of the FFmpeg library to remove the redundant dependencies.

So, we recompiled the FFmpeg library to support only the necessary streaming protocol and video codec. This allowed us to reduce the app to a fourth of its former size and lower the latency to 500 ms. With a bit more customization of a library’s build script, our developers managed to reduce the frame delay to 450 ms.
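A stripped-down FFmpeg build of this kind is typically produced by disabling everything in the configure script and re-enabling only the needed components. The flags below are an illustrative sketch, not the client's actual build script:

```shell
# Illustrative FFmpeg configure invocation: disable all components,
# then re-enable only UDP input and H.264 demuxing/parsing/decoding.
./configure \
  --disable-everything \
  --disable-programs \
  --disable-doc \
  --enable-protocol=udp \
  --enable-demuxer=h264 \
  --enable-parser=h264 \
  --enable-decoder=h264
```

Because FFmpeg compiles in only the components you enable, trimming the list like this is what shrinks both the binary size and the per-frame processing overhead.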

However, mobile testing revealed that the application drove the phone to 40–50% CPU usage. With a load that high, the application would become a battery hog. Since our FFmpeg build did all its decoding on the CPU without touching the GPU, we decided to switch to hardware-accelerated native decoders.

Stop #3: Custom RTP parsing for iOS

After eliminating third-party dependencies, our team was able to gain full control over the video streaming process. We could optimize and configure everything from data-receiving sockets to the last frame. Our developers used CocoaAsyncSocket, a reliable and powerful socket library for transferring video streams.

We set up multiple sockets to connect the devices. A TCP socket carried the control commands and guaranteed that every byte sent reached the recipient, while response status codes arrived over a separate TCP port.
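The control channel can be sketched with CocoaAsyncSocket's `GCDAsyncSocket` along these lines. The class, command framing, and connection parameters here are hypothetical, not the client's protocol:

```swift
import CocoaAsyncSocket

/// Sketch of a TCP control channel built on CocoaAsyncSocket.
/// Timeouts, tags, and framing are illustrative assumptions.
final class ControlChannel: NSObject, GCDAsyncSocketDelegate {
    private lazy var socket = GCDAsyncSocket(delegate: self, delegateQueue: .main)

    func connect(host: String, port: UInt16) throws {
        try socket.connect(toHost: host, onPort: port)
    }

    func send(command: Data) {
        // TCP guarantees ordered, complete delivery of the command bytes.
        socket.write(command, withTimeout: 5, tag: 0)
    }

    // MARK: - GCDAsyncSocketDelegate

    func socket(_ sock: GCDAsyncSocket, didConnectToHost host: String, port: UInt16) {
        // In the real setup, status codes arrive over a separate TCP port;
        // here we simply start reading responses for brevity.
        sock.readData(withTimeout: -1, tag: 0)
    }
}
```

Keeping commands on TCP and video on UDP is the classic split: reliability where every byte matters, low latency where an occasional loss is tolerable.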

Conversely, the stream data was sent over UDP sockets, which do not guarantee that packets reach the recipient. It was therefore crucial for the system to receive every packet from the moment the RTP frame's start header was detected until its end header was identified: if any packet goes missing, the video frame cannot be decoded because of the data discontinuity.
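The reassembly logic can be sketched as follows, assuming each parsed packet exposes its RTP sequence number plus start/end flags (the `FramePacket` type is hypothetical). Any gap in sequence numbers makes the frame undecodable, so the assembler discards it and waits for the next start marker:

```swift
import Foundation

/// Hypothetical shape of one parsed stream packet.
struct FramePacket {
    let sequenceNumber: UInt16
    let isStart: Bool
    let isEnd: Bool
    let payload: Data
}

/// Sketch: accumulates packets into one frame, dropping the frame on any gap.
final class FrameAssembler {
    private var buffer = Data()
    private var expectedSequence: UInt16?

    /// Returns the complete frame when the end packet arrives, nil otherwise.
    func process(_ packet: FramePacket) -> Data? {
        if packet.isStart {
            buffer = Data()
            expectedSequence = packet.sequenceNumber
        }
        guard packet.sequenceNumber == expectedSequence else {
            // A packet was lost: the frame cannot be decoded, so discard it.
            buffer = Data()
            expectedSequence = nil
            return nil
        }
        buffer.append(packet.payload)
        expectedSequence = packet.sequenceNumber &+ 1  // wraps at 65535
        if packet.isEnd {
            defer { buffer = Data(); expectedSequence = nil }
            return buffer
        }
        return nil
    }
}
```

Dropping an incomplete frame and resynchronizing on the next start header trades an occasional skipped frame for the low latency that buffering-and-retransmitting would destroy.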

To solve this, our team implemented full RTP protocol support and video stream display using the native AVFoundation framework and AVSampleBufferDisplayLayer. This allowed us to reduce the frame-processing latency to 350 milliseconds and lower CPU usage to 15–20% on iPhone XS.
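On the display side, decoded samples are handed to `AVSampleBufferDisplayLayer` roughly as below. Building the `CMSampleBuffer` from the parsed H.264 data is assumed to happen upstream and is not shown; the wrapper class itself is illustrative:

```swift
import AVFoundation

/// Sketch of feeding decoded H.264 samples to AVSampleBufferDisplayLayer.
final class StreamView {
    let displayLayer = AVSampleBufferDisplayLayer()

    init() {
        displayLayer.videoGravity = .resizeAspect
    }

    func display(_ sampleBuffer: CMSampleBuffer) {
        // After a decode error the layer must be flushed before it
        // accepts new samples again.
        if displayLayer.status == .failed {
            displayLayer.flush()
        }
        displayLayer.enqueue(sampleBuffer)
    }
}
```

Because the layer hands compressed samples straight to the system decoder, the heavy lifting moves off the CPU, which is what drove the usage down to the 15–20% range.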

Stop #4: Increasing the speed of video parsing with Metal

Our team proceeded with implementing parallel video frame rendering and moving the load to the GPU via Metal. This gave us access to the capabilities of the hardware H.264 decoder.

For this purpose, we removed AVSampleBufferDisplayLayer and built a custom PlayerView on top of Metal. The media player now processed data directly on the GPU and rendered video frames quickly. This solution brought CPU usage down to 4–5% on the iPhone XS and shrank the application to one-tenth of its original size.

Here’s what the video streaming process looks like:

(Diagram: the video streaming process)

An example of code for GPU-accelerated image processing:

import MetalKit
import CoreMedia
import CoreVideo

final class PlayerViewEngine {
    private let device: MTLDevice
    private let commandQueue: MTLCommandQueue
    private var textureCache: CVMetalTextureCache?

    private let colorSpace = CGColorSpaceCreateDeviceRGB()

    // Core Image context backed by the Metal device; the working color
    // space is disabled so no extra color conversion happens on render.
    private lazy var ciContext: CIContext = {
        CIContext(mtlDevice: device, options: [.workingColorSpace: NSNull()])
    }()

    init(device: MTLDevice) {
        self.device = device
        // A Metal device always provides a command queue.
        commandQueue = device.makeCommandQueue()!
        CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &textureCache)
    }

    func render(to drawable: CAMetalDrawable, imageBuffer: CVImageBuffer) {
        guard let commandBuffer = commandQueue.makeCommandBuffer() else {
            return
        }

        // Wrap the decoded pixel buffer in a CIImage and scale it
        // to fill the drawable.
        let image = CIImage(cvImageBuffer: imageBuffer)
        let drawableSize = CGSize(width: drawable.texture.width, height: drawable.texture.height)
        let drawingBounds = CGRect(origin: .zero, size: drawableSize)

        let scaleX = drawableSize.width / image.extent.width
        let scaleY = drawableSize.height / image.extent.height
        let scaledImage = image.transformed(by: CGAffineTransform(scaleX: scaleX, y: scaleY))

        // Render directly into the drawable's texture on the GPU.
        ciContext.render(scaledImage,
                         to: drawable.texture,
                         commandBuffer: commandBuffer,
                         bounds: drawingBounds,
                         colorSpace: colorSpace)

        commandBuffer.present(drawable)
        commandBuffer.commit()
    }
}

The outcome

The project wasn’t an easy ride, but our developers came up with an optimal solution that helped us bring the client’s vision to life. The optimization techniques and solutions described above allowed us to build a reliable mobile application with seamless video streaming capabilities. The app streams video without stalls and can also apply filters to the video and recognize objects. It is battery-friendly as well, leaving more processing resources for high-priority device-related tasks.

Read more: Stream Vision 2 case study

If your application is facing a similar problem or you just need an efficient video streaming solution, the Orangesoft team is ready to help. Tell us about your project, and we’ll take it from there.



19 September 2024