Measure and Adjust Microphone Latency in Ecamm Live

If your audio and video are out of sync, for example you hear your voice before your lips move on screen, you are running into latency. No one wants to look like a badly dubbed kung fu movie. Here’s how to handle it with the audio preferences in Ecamm Live.

Audio-to-video synchronization (also known as lip sync, or by the lack of it: lip sync error, lip flap) refers to the relative timing of audio (sound) and video (image) parts during creation, post-production (mixing), transmission, reception, and playback processing. AV synchronization can be an issue in television, videoconferencing, or film.

In industry terminology, the lip sync error is expressed as the amount of time the audio departs from perfect synchronization with the video: a positive number indicates that the audio leads the video, and a negative number indicates that the audio lags the video. This terminology and numeric convention are used in the professional broadcast industry, as evidenced by professional papers and standards such as ITU-R BT.1359-1.
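To make the sign convention concrete, here is a minimal Python sketch. The function names and the sample timestamps are hypothetical, chosen only for illustration; this is not Ecamm Live's API, just the arithmetic behind the number you would type into an audio delay setting.

```python
def lip_sync_error_ms(audio_time_ms: float, video_time_ms: float) -> float:
    """Return the lip sync error for one event (for example, a hand clap).

    audio_time_ms: when the sound of the event is heard.
    video_time_ms: when the same event is seen on screen.
    Positive result: audio leads video. Negative result: audio lags video.
    """
    return video_time_ms - audio_time_ms


def audio_delay_to_apply_ms(error_ms: float) -> float:
    """Return how much to delay the audio so it lines up with the video.

    Delaying the audio only helps when the audio leads (positive error).
    If the audio lags, the video would have to be delayed instead, so
    this sketch returns 0.
    """
    return max(error_ms, 0.0)


# Hypothetical measurement: the clap is heard at 100 ms and seen at 180 ms.
error = lip_sync_error_ms(audio_time_ms=100, video_time_ms=180)
print(f"Lip sync error: {error:+.0f} ms")
print(f"Delay the audio by: {audio_delay_to_apply_ms(error):.0f} ms")
```

A clap or a clapperboard is a convenient test event because it has a sharp, easy-to-spot moment in both the picture and the sound.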

Digital or analog audio/video streams and video files usually contain some sort of synchronization mechanism, either interleaved video and audio data or explicit relative timestamps. Processing must respect this relative timing, for example by stretching or interpolating the received data. If it does not, the AV-sync error will increase whenever data is lost to transmission errors or to missing or mis-timed processing.

Why is there a delay on my live stream?

This is a common question from people who are new to live streaming, and from live stream participants who wonder why there’s a 30-second delay when they watch themselves on their favorite platform.

Live Streaming Latency

Even live TV has a delay. Most people never notice it because they haven’t been on the set of a TV studio. There is always a slight delay, even with live TV, because the video is processed before it hits the air. There’s even more delay online!

HDMI Latency

Technically, this isn't an HDMI issue: some cameras just don't have the processing power for real-time video.

CPU Processing

Depending on your live switcher, additional processing is applied to the signal (scaling, adding graphics, and so on) before the video is sent to your CDN (Content Delivery Network).

Below are a few streaming terms that everyone should be familiar with. They can be a little overwhelming, so we’ve defined them here for reference.

Latency

What most people refer to as "delay": the amount of time between an event happening in the “real world” and the display of that event on the viewer’s screen.

Buffering

Before a video can play, a certain amount of data must be downloaded in advance (pre-loaded) so that playback can continue smoothly.

Content Delivery Network (CDN)

A distribution system on the Internet that accelerates the delivery of Web pages, audio, video, and other Internet-based content to users around the world.

Embedded Audio

The audio signal travels to the destination embedded in the video signal. This workflow is recommended to avoid audio/video sync issues. For example, the microphone is plugged into the camera instead of the encoder.

Transcoding

The process of decoding an incoming media stream, changing one or more of its parameters (e.g. codec, video size, sampling rate, or encoder capabilities), and re-encoding it with the new parameter settings.

Video Distribution Service (VDS)

Though a VDS can take many forms, it is essentially responsible for taking one or more incoming streams of video and audio (from a broadcaster) and presenting them to viewers. This includes what is commonly referred to as a Content Delivery Network.

Why Does Latency Happen?

It comes down to physics. Video has miles to cover between the camera and the screen of the viewing device, and a series of technical steps gets it there. After video is captured, it is converted a few seconds at a time into a format that can be sent across the Internet.

That video has to be processed into different qualities so it can be viewed smoothly on different devices from a laptop to an iPhone. All those versions are sent across multiple servers around the country.

Video Capture

Whether you’re using a single camera or a sophisticated video mixing system, taking a live image and turning it into digital signals takes some time. At minimum, this takes the duration of a single captured video frame (1/30th of a second, about 33 milliseconds, at a 30 fps frame rate).

More advanced systems such as video mixers will introduce additional latency for decoding, processing, re-encoding, and re-transmitting. Your video capture and processing requirements will determine this value.

Minimum: about 33 milliseconds

Maximum: hundreds of milliseconds
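The 33-millisecond minimum follows directly from the frame rate. As a quick sketch (the function name is ours, purely for illustration), the duration of one frame in milliseconds is 1000 divided by the frames per second:

```python
def frame_duration_ms(fps: float) -> float:
    """Return the duration of a single video frame in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Common frame rates and the minimum capture latency each one implies.
for fps in (24, 30, 60):
    print(f"{fps} fps -> {frame_duration_ms(fps):.1f} ms per frame")
```

A higher frame rate shortens this minimum, but it is only one part of the overall delay.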

ATEM Mini Pro Switcher

Capture Card

When encoding in software (on a PC or Mac) or using a hardware encoder (such as a Cam Link or Magewell card), it takes time to convert the video signal into a compressed format suitable for transmission across the Internet. This latency can range from extremely low (thousandths of a second) to values closer to the duration of a video frame. Changing encoding parameters can lower this value at the expense of encoded video quality.

Minimum: about 1 millisecond

Maximum: about 40-50 milliseconds

Magewell Capture Card

Transmission to Facebook or YouTube Servers

The encoded video takes time to transmit over the Internet to a CDN. This latency is affected by the encoded media bitrate (lower bitrate usually means lower latency), the latency and bandwidth of the internet connection, and the proximity (over the Internet) to the CDN.

Minimum: about 5-10 milliseconds

Maximum: hundreds of milliseconds

Server Transcoding

Your viewers will be watching from many kinds of devices (PCs, Macs, tablets, phones, TVs, and set-top boxes) over many types of networks (LAN/Wi-Fi, 5G, 4G LTE, etc.). To provide a quality viewing experience across that range of devices, a good streaming provider should provide an optimized stream.

There are two general ways to accomplish this: either the encoder streams multiple quality levels to the CDN (which are directly relayed to viewers), or the encoder sends a single high-quality stream to the CDN, which then transcodes and transrates it to multiple levels. Typically, the transcoding and transrating takes about as long as a “segment” of encoded video (more about segments later), but it can be faster at smaller resolutions and lower bitrates.

Minimum: about 1 second

Maximum: about 10 seconds
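Adding up the stages so far gives a rough latency budget. The sketch below uses the minimum and maximum figures quoted above; where the text only says "hundreds of milliseconds", it assumes a placeholder of 300 ms, which is our guess and not a measured value. The stage names and the helper function are also ours.

```python
# (stage name, minimum ms, maximum ms), taken from the figures above.
# 300 ms stands in for "hundreds of milliseconds" and is an assumption.
STAGES = [
    ("Video capture", 33, 300),
    ("Encoding", 1, 50),
    ("Transmission to CDN", 5, 300),
    ("Server transcoding", 1_000, 10_000),
]


def total_latency_ms(stages):
    """Return (minimum, maximum) total latency in ms across all stages."""
    total_min = sum(low for _, low, _ in stages)
    total_max = sum(high for _, _, high in stages)
    return total_min, total_max


low, high = total_latency_ms(STAGES)
for name, stage_min, stage_max in STAGES:
    print(f"{name:<22} {stage_min:>6} ms to {stage_max:>6} ms")
print(f"{'Total':<22} {low:>6} ms to {high:>6} ms")
print(f"Roughly {low / 1000:.1f} to {high / 1000:.1f} seconds before delivery to viewers")
```

Even with generous guesses for the smaller stages, server transcoding dominates the total, which is why delays are measured in seconds rather than milliseconds.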

Why doesn't Zoom or Skype have a delay?

There’s a difference between “live conferencing” (FaceTime, Skype, Zoom) and “streaming” platforms like YouTube or Facebook Live. The biggest difference is how the content is consumed. Live streaming is typically one-to-many, whereas conferencing is two-way communication among a limited number of participants.

The difference may seem trivial, but it is very important when the number of participants or viewers scales to a large number.

Collaboration requires specialized coding and computing services to reduce the delay between participants. These do not scale well to large numbers of participants. That is why there’s typically a limit to the number of participants in conferencing software.

Transmission to your viewers

Each time you watch a live stream or video on demand, streaming protocols are used to deliver data over the internet. These can sit in the application, presentation, and session layers.

Online video delivery uses both streaming protocols and HTTP-based protocols. Streaming protocols like Real-Time Messaging Protocol (RTMP) enable speedy video delivery using dedicated streaming servers, whereas HTTP-based protocols rely on regular web servers to optimize the viewing experience and quickly scale. Finally, a handful of emerging HTTP-based technologies like the Common Media Application Format (CMAF) and Apple’s Low-Latency HLS seek to deliver the best of both options to support low-latency streaming at scale.

How can I reduce latency?

You can do a lot to reduce the latency of your live streams simply by changing encoder settings, internet service providers, or the type of connection.

Some attributes of your total latency may be within your control, such as bandwidth, encoding, or video format. Your encoder settings, the jitter buffer, the transcoding and transrating profiles, and the segment duration may also be configurable. Keep in mind, however, that while lower latency may sound desirable, it’s important to test these settings with great caution, as each choice may have negative consequences of its own.
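To get a feel for why segment duration matters, here is a simplified sketch. It assumes the viewer's player waits for a fixed number of segments before it starts playing; the count of three is an illustrative assumption, not a setting taken from any particular platform, and real players behave in more complicated ways.

```python
def buffered_delay_seconds(segment_duration_s: float, segments_buffered: int) -> float:
    """Estimate the delay added by player buffering, in seconds.

    Simplified model: the player downloads `segments_buffered` whole
    segments before it starts playback.
    """
    if segment_duration_s <= 0 or segments_buffered < 0:
        raise ValueError("segment duration must be positive and count non-negative")
    return segment_duration_s * segments_buffered


# Assumption: the player buffers three segments before starting.
SEGMENTS_BUFFERED = 3
for duration in (6, 4, 2):
    delay = buffered_delay_seconds(duration, SEGMENTS_BUFFERED)
    print(f"{duration} s segments -> about {delay:.0f} s of buffering delay")
```

Shorter segments reduce this delay, but they also leave less room to absorb network hiccups, which is one of the trade-offs worth testing carefully.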