WebRTC video quality requires some tweaking to get done properly. Let's see what levers we have available to us in the form of bitrate, resolution and frame rate.
Real-time video is tough. WebRTC might make things a bit easier, but there are things you still need to take care of - especially if what you're aiming for is to squeeze every possible ounce of WebRTC video quality out of your application to improve the user's experience.
This time, I want to cover what levers we have at our disposal that affect video quality - and how to use them properly.
Video plays a big role in communication these days. A video call/session/meeting is going to heavily rely on the video quality. Obviously…
But what is it then that affects the video quality? Let's try and group the factors into 3 main buckets: out of our control, service related and device related. This will enable us to focus on what we can control and where we should put our effort.
There are things that are out of our control. We have the ability to affect them, but only a bit and only up to a point. To take an extreme example: if the user is sitting in Antarctica, inside an elevator, at the basement level somewhere, with no Internet connection and no cellular reception - then in all likelihood, even if he complains that calls aren't getting connected, there's nothing anyone will be able to do about it besides suggesting he move closer to the Wi-Fi access point.
The main two things we can’t really control? Bandwidth and the transport protocol that will be used.
👉 We can’t control the user’s device and its capabilities either, but most of the time, people tend to understand this.
Bandwidth is how much data we can send or receive over the network. The higher this value is, the better.
The thing is, we have little to no control over it: the user's location, the network he is connected to, his ISP and whatever else is competing for that connection. None of this is in our control.
And while we can do minor things to improve this, such as positioning our servers as close as possible to the users, there’s not much else.
Our role with bandwidth is to estimate it as accurately as possible. WebRTC has mechanisms for bandwidth estimation. Why is this important? If we know how much bandwidth is available to us, we can try to make better use of it -
👉 Over-estimating bandwidth means we might end up sending more than the network can handle, which in turn is going to cause congestion (=bad)
👉 Under-estimating bandwidth means we will be sending out less data than we could have, which will end up reducing the media quality we could have provided to the users (=bad)
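If you want to see what WebRTC currently estimates, the send-side figure is exposed through getStats() on the selected candidate pair. Here's a minimal sketch - the helper name and polling interval are my own choices, not anything prescribed here:

```typescript
// A minimal sketch, assuming `pc` is an already-connected RTCPeerConnection provided by your app.
declare const pc: RTCPeerConnection;

// Read WebRTC's send-side bandwidth estimate (bits per second) from the active candidate pair.
async function getEstimatedSendBitrate(pc: RTCPeerConnection): Promise<number | undefined> {
  const stats = await pc.getStats();
  let estimate: number | undefined;
  stats.forEach((report) => {
    if (report.type === "candidate-pair" && report.nominated && report.state === "succeeded") {
      if (typeof report.availableOutgoingBitrate === "number") {
        estimate = report.availableOutgoingBitrate;
      }
    }
  });
  return estimate;
}

// Usage: poll every few seconds and log the estimate in kbps.
setInterval(async () => {
  const bps = await getEstimatedSendBitrate(pc);
  if (bps !== undefined) {
    console.log(`Estimated available outgoing bitrate: ${Math.round(bps / 1000)}kbps`);
  }
}, 3000);
```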
I’ve already voiced my opinion about using TCP for WebRTC media and why this isn’t a good idea.
The thing is, you don't really control what gets selected. For the most part, this is what the distribution of your sessions is going to look like:
Why is that? Just because networks are configured differently. And you have no control over it.
👉 You can and should make sure your own chart looks somewhat like this one - the vast majority of sessions running directly over UDP, with TURN/TCP and TURN/TLS as a small minority. 90% of the sessions done over TURN/TCP should definitely raise a few red flags for you.
But once you reach a distribution similar to the above, or once you know how to explain what you’re seeing when it comes to the distribution of sessions, then there’s not much else for you to optimize.
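To build that distribution for your own service, you can check which candidate pair got selected for each session and what transport it rides on. A rough sketch, again using getStats() - the classification labels are my own:

```typescript
// A minimal sketch, assuming `pc` is an already-connected RTCPeerConnection provided by your app.
declare const pc: RTCPeerConnection;

// Classify the transport of the selected connection: UDP, TCP, TURN/UDP, TURN/TCP or TURN/TLS.
async function classifyTransport(pc: RTCPeerConnection): Promise<string> {
  const stats = await pc.getStats();
  const byId = new Map<string, any>();
  stats.forEach((report) => byId.set(report.id, report));

  let result = "unknown";
  stats.forEach((report) => {
    if (report.type === "candidate-pair" && report.nominated && report.state === "succeeded") {
      const local = byId.get(report.localCandidateId);
      if (!local) return;
      if (local.candidateType === "relay") {
        // relayProtocol tells us how the TURN server is reached: udp, tcp or tls.
        result = `TURN/${(local.relayProtocol || "udp").toUpperCase()}`;
      } else {
        result = (local.protocol || "udp").toUpperCase();
      }
    }
  });
  return result;
}

// Usage: report this per session to your analytics, and build the distribution chart from it.
classifyTransport(pc).then((transport) => console.log(`Session transport: ${transport}`));
```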
Service related aspects are things that are within our control and are usually handled in our infrastructure. This is where differentiation based on how we decided to architect and deploy our backend comes into play.
While bandwidth isn’t something we can control, bitrate is. Where bandwidth is the upper limit of what the network can send or receive, bitrate is what we actually send and receive over the network.
We can’t send more than what the bandwidth allows, and we might not always want to send the maximum bitrate that we can either.
Our role here is to pick the bitrate that is most suitable for our needs. What does that mean in practice?
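One concrete lever for this is capping what a sender is allowed to use, so that our own policy - and not only the bandwidth estimator - decides the outgoing bitrate. A hedged sketch using RTCRtpSender.setParameters(); the 800kbps figure is purely illustrative:

```typescript
// A minimal sketch, assuming `pc` is an RTCPeerConnection that already has a video track added.
declare const pc: RTCPeerConnection;

// Cap the video send bitrate (bits per second) so we never exceed our own policy.
async function capVideoBitrate(pc: RTCPeerConnection, maxBitrateBps: number): Promise<void> {
  const sender = pc.getSenders().find((s) => s.track?.kind === "video");
  if (!sender) return;

  const params = sender.getParameters();
  if (!params.encodings || params.encodings.length === 0) {
    return; // nothing negotiated yet - try again once the connection is up
  }
  params.encodings[0].maxBitrate = maxBitrateBps;
  await sender.setParameters(params);
}

// Usage: never send more than 800kbps of video, regardless of what the estimator would allow.
capVideoBitrate(pc, 800_000);
```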
👉 It is important to understand that increasing bitrate doesn't always increase quality. It can be detrimental to quality as well.
Here are a few examples: pushing more bitrate than the network can carry causes packet loss and congestion, and pushing more than the receiving device can decode causes freezes and delays. There are a lot of other such cases as well.
So what do we do? I know, I am repeating myself, but this is critical - estimate the available bandwidth as accurately as possible, and then pick the bitrate that best suits your needs within that estimate.
Codecs affect media quality.
For voice, G.711 is bad, Opus is great. Lyra and Satin look promising as future alternatives/evolution.
With video, this is a lot more nuanced. You have a selection of VP8, VP9, H.264, HEVC and AV1.
Here are a few things to consider when selecting a video codec for your WebRTC application:
👉 Choosing a video codec for your service isn't a simple task. If you don't know what you're doing, just stick with VP8 or H.264. Experimenting with codecs is a great time waster unless you know your way around them.
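If you do decide to pin a codec, browsers let you reorder the negotiated list with RTCRtpTransceiver.setCodecPreferences(). A sketch that prefers VP8 - the choice here is only an example of the mechanism, in line with the "stick with VP8 or H.264" advice above:

```typescript
// A minimal sketch, assuming `pc` is an RTCPeerConnection with a video transceiver already added.
declare const pc: RTCPeerConnection;

// Reorder the codec list so the preferred codec is negotiated first (call before createOffer()).
function preferVideoCodec(pc: RTCPeerConnection, mimeType: string): void {
  const capabilities = RTCRtpSender.getCapabilities("video");
  if (!capabilities) return;

  const preferred = capabilities.codecs.filter((c) => c.mimeType.toLowerCase() === mimeType.toLowerCase());
  const others = capabilities.codecs.filter((c) => c.mimeType.toLowerCase() !== mimeType.toLowerCase());

  pc.getTransceivers()
    .filter((t) => t.receiver.track.kind === "video")
    .forEach((t) => t.setCodecPreferences([...preferred, ...others]));
}

// Usage: prefer VP8, keeping everything else as a fallback.
preferVideoCodec(pc, "video/VP8");
```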
While we don’t control where users are - we definitely control where our servers are located. Which means that we can place the servers closer to the users, which in turn can reduce the latency (among other things).
Here are some things to consider:
👉 Measure the latency of your sessions (through RTT). Try to reduce it for your users as much as possible. And assume this is an ongoing, never-ending process.
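RTT is available from getStats() on the selected candidate pair, so measuring it continuously is straightforward. A minimal sketch (the helper name is mine):

```typescript
// A minimal sketch, assuming `pc` is an already-connected RTCPeerConnection provided by your app.
declare const pc: RTCPeerConnection;

// Read the current round trip time (milliseconds) of the selected candidate pair.
async function getCurrentRttMs(pc: RTCPeerConnection): Promise<number | undefined> {
  const stats = await pc.getStats();
  let rttMs: number | undefined;
  stats.forEach((report) => {
    if (report.type === "candidate-pair" && report.nominated && report.state === "succeeded") {
      if (typeof report.currentRoundTripTime === "number") {
        rttMs = report.currentRoundTripTime * 1000; // the stat is reported in seconds
      }
    }
  });
  return rttMs;
}

// Usage: sample periodically and ship the values to your monitoring backend.
getCurrentRttMs(pc).then((rtt) => console.log(`RTT: ${rtt?.toFixed(0)}ms`));
```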
Here’s a session from Kranky Geek discussing latencies and media servers:
There's a lot to be said about the infrastructure side in WebRTC. I tried to capture these insights in an ebook that is more relevant today than ever - Best practices in scaling WebRTC deployments
You don’t get to choose the device your users are going to use to join their meetings. But you do control how your application is going to behave on these devices.
There are several things to keep in mind here that are going to improve the media quality for your users if done right on their device.
This should be your top priority: understanding how much CPU is being used on the user's device and deciding when you've gone too far.
What happens when the device is “out of CPU”? (See also our article about the video quality metrics in WebRTC.)

In short:
👉 You end up with poor video quality and video freezes
👉 The network gets more congested due to frequent requests for I-frames
👉 Your device heats up and battery life suffers
Your role here is to monitor CPU use and make sure it isn't too high - and if it is, to reduce it. Your best tool for reducing CPU use is reducing the bitrates you're sending and/or receiving.
Sadly, monitoring the CPU directly isn't possible in the browser itself, and you'll need to find other means of figuring out the state of the CPU.
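One indirect signal the browser does give you is qualityLimitationReason on the outbound video stats - when it reports "cpu", the encoder is already being throttled by processing load. A hedged sketch; the polling interval and the reaction are my own assumptions:

```typescript
// A minimal sketch, assuming `pc` is an already-connected RTCPeerConnection sending video.
declare const pc: RTCPeerConnection;

// Check whether WebRTC reports that outgoing video quality is currently limited by the CPU.
async function isCpuLimited(pc: RTCPeerConnection): Promise<boolean> {
  const stats = await pc.getStats();
  let cpuLimited = false;
  stats.forEach((report) => {
    if (report.type === "outbound-rtp" && report.kind === "video") {
      // Possible values: "none", "cpu", "bandwidth", "other".
      cpuLimited = cpuLimited || report.qualityLimitationReason === "cpu";
    }
  });
  return cpuLimited;
}

// Usage: if this stays true for a while, reduce the bitrates you send and/or receive.
setInterval(async () => {
  if (await isCpuLimited(pc)) {
    console.warn("Video is CPU limited - consider lowering bitrate, resolution or frame rate");
  }
}, 5000);
```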
With video, content and placement matter.
Let's say you have 1,000kbps of “budget” to spend. That’s because the bandwidth estimator gives you that amount and you know/assume the CPU of both the sender and receiver(s) can handle that bitrate.
How do you spend that budget?
WebRTC makes its own decisions. These are based on the bitrate available. It will automatically decide to increase or reduce resolution and frame rate to accommodate what it believes will give the best quality. You can even pass hints about your content type - whether you value motion over sharpness or vice versa.
There are things that WebRTC doesn't know on its own though:
It is going to be your job to figure these things out and place or remove restrictions on what you want from your video.
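The hints and restrictions mentioned above map to a couple of concrete browser APIs: MediaStreamTrack.contentHint for the motion-vs-sharpness tradeoff, and applyConstraints() for capping resolution and frame rate. A minimal sketch - the specific values are illustrative assumptions, not recommendations:

```typescript
// A minimal sketch, assuming `track` is a video MediaStreamTrack from getUserMedia()/getDisplayMedia().
// The specific values below are illustrative assumptions.
async function tuneVideoTrack(track: MediaStreamTrack, isScreenShare: boolean): Promise<void> {
  // "detail" favors sharpness (slides, documents); "motion" favors smoothness (webcams).
  track.contentHint = isScreenShare ? "detail" : "motion";

  await track.applyConstraints(
    isScreenShare
      ? { frameRate: { max: 5 } } // static content rarely needs more than a few fps
      : { width: { ideal: 1280 }, height: { ideal: 720 }, frameRate: { ideal: 30 } }
  );
}

// Usage:
// const [track] = (await navigator.mediaDevices.getUserMedia({ video: true })).getVideoTracks();
// await tuneVideoTrack(track, false);
```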
The bigger the meeting the more challenging and optimized your code will need to be in order to support it. WebRTC gives you a lot of powerful tools to scale a meeting, but it leaves a lot to you to figure out. This ebook will reveal these tools to you and enable you to increase your meeting sizes - Optimizing Group Video Calling in WebRTC
Video quality in WebRTC is like a 3-legged stool. All other things being equal, you can tweak the bitrate, frame rate and resolution. At least that's what you have at your disposal dynamically, in real time, when you are in the middle of a session and need to make a decision.
Bitrate can be seen as the most important leg of the stool (more on that below).
The other two, frame rate and resolution, are quite dependent on one another. A change in one will immediately force a change in the other if we wish to keep the image quality. Increasing or decreasing the bitrate can cause a change in both frame rate and resolution.
I see a lot of developers start by tweaking frame rates or resolutions. While this is admirable and even reasonable at times, it is the wrong starting point.
What you should be doing is to follow the bitrate in WebRTC. Start by figuring out and truly understanding how much bitrate you have in your budget. Then decide how to allocate that bitrate based on your constraints:
Always start with bitrate.
Then figure out the constraints you have on resolution and frame rate based on CPU, devices, screen resolution, content type, ... and in general on the context of your session.
The rest (resolution and frame rate) should follow.
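To make the "bitrate first" flow concrete, here's a rough sketch that maps a bitrate budget to a resolution and frame rate profile. The table of thresholds is a hypothetical heuristic of my own, not something prescribed here:

```typescript
// A hypothetical heuristic: map a bitrate budget (kbps) to a resolution/frame rate profile.
// The thresholds are illustrative assumptions, not recommendations.
interface VideoProfile { width: number; height: number; frameRate: number; }

function profileForBudget(budgetKbps: number): VideoProfile {
  if (budgetKbps >= 1500) return { width: 1280, height: 720, frameRate: 30 };
  if (budgetKbps >= 800)  return { width: 960,  height: 540, frameRate: 30 };
  if (budgetKbps >= 400)  return { width: 640,  height: 360, frameRate: 25 };
  return { width: 320, height: 180, frameRate: 15 };
}

// Apply the profile as capture constraints; WebRTC can still degrade below these caps if needed.
async function applyBudget(track: MediaStreamTrack, budgetKbps: number): Promise<void> {
  const p = profileForBudget(budgetKbps);
  await track.applyConstraints({
    width: { ideal: p.width },
    height: { ideal: p.height },
    frameRate: { max: p.frameRate },
  });
}

// Usage: with the 1,000kbps budget from the example above this picks 960x540@30.
// await applyBudget(videoTrack, 1000);
```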
And in most cases, it will be preferable to “hint” WebRTC on the type of content you have and let WebRTC figure out what it should be doing. It is rather good at that, otherwise, what would be the point of using it in the first place?
Once we have the bitrate nailed down - should you go for a higher resolution or a higher frame rate?
Here are a few guidelines for you to use:
I've had my fair share of discussions lately with vendors who were working with WebRTC but didn't have enough of an understanding of it. Often the results aren't satisfactory, falling short of what is considered good media quality these days. All because of wrong assumptions or bad optimizations that backfired.
If you are planning to use WebRTC or even using WebRTC, then you should get to know it better. Understand how it works and make sure you’re using it properly. You can achieve that by enrolling in my WebRTC training courses for developers.
Get your copy of my ebook on the top 7 video quality metrics and KPIs in WebRTC.