AV1 is coming to WebRTC sooner rather than later. Apparently so is HEVC. It is an AV1 vs HEVC game now, but sadly, these codecs are unavailable to the “rest of us”.
WebRTC codec wars are something we’ve seen in the past. During the early days of WebRTC there were ongoing discussions about whether the mandatory video codec in WebRTC should be VP8 or H.264. The outcome was to have both of them mandatory to implement in browsers.
Fast forward to today, and life is simple. We have ubiquity and support across all browsers that have WebRTC in them, which is great.
We are now gearing up for the next fight. This one isn’t going to be between VP9 and HEVC, but rather between AV1 and HEVC.
If you are looking to learn more about the differences between the WebRTC codecs, be sure to read this article as well: 🎲 Which video codec to use in your WebRTC application? 🎲
COVID-19 is causing all communication vendors to fast forward and accelerate their roadmaps by 6-18 months. Those that don’t are going to be left behind on the other side of this pandemic.
This isn’t an attempt to scare anyone or to FUD people into doing things. It is just the way things are.
If you want to see how serious things are, just check what’s going on around you:
The AV1 vs HEVC angles here are VERY interesting.
HEVC requires royalties and is a licensing mess.
AV1 is so new it hasn’t even had an opportunity to cool down a bit after being taken out of the oven. Frankly? It is still half baked and requires a bit more cookin’ - and yet… it is now being rolled out in Google Duo.
The thing is that 6 months back, video was a nice-to-have: a feature to be ticked off in a long requirements list.
Today? Video first. All the rest comes later.
Zoom’s stock price and market cap are the best indicators of that change.
In less than 10 years, we’ve witnessed 3 codec generations in WebRTC:
With each codec generation introduced, CPU and memory requirements grow along with the complexity of the codec, and the resulting quality for a given bitrate increases.
I’ve been working with H.264 since 200x. Probably somewhere in 2005. It was brand new at the time and was about to replace H.263 and all of its extensions.
Fast forward to around 2010, when you started seeing it deployed in almost all video conferencing room systems.
VP8 came to our lives along with WebRTC, in around 2012. It is comparable to H.264.
There are reasons to pick H.264 over VP8. And while hardware acceleration is more readily available for H.264 than for VP8, relying on it does pose challenges.
Both are probably at their peak right now when it comes to video calling:
This is the tipping point, where a new video codec is being sought after.
👉 If you are using either of them today, you should be just fine. If you seriously want to be at the forefront of technology, right on the bleeding edge (and you will bleed - time, money and blood), then read on to your next alternatives.
And if you need to decide between VP8 and H.264, check out this free video course: H.264 or VP8?
It should have been a VP9 vs HEVC thing and not an AV1 vs HEVC thing.
The next best thing in video codecs was supposed to be VP9. VP9 is the successor to VP8. HEVC is what comes next after H.264, and the intent was always for VP9 to be the alternative to HEVC.
As things go, VP9 advantages are just what you’d expect in a new codec generation:
What VP9 was supposed to bring to the world is SVC - scalability. While VP8 supports temporal scalability, VP9 was touted as a codec that would also bring spatial and SNR scalability on top of it. With VP9 SVC we were supposed to improve the resiliency of video as well as the ability to scale large group video calls better than ever before. This never really came to be, as some of these improvements have been left out of the official WebRTC APIs to this day.
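To make the scalability idea a bit more concrete, here is a minimal sketch of temporal scalability as seen from a media server. It assumes each encoded frame is tagged with a temporal layer id; the `EncodedFrame` type and the `forwardUpTo` function are hypothetical names made up for this illustration and are not part of any WebRTC API. Spatial and SNR layers are not shown.

```typescript
// Hypothetical frame descriptor, for illustration only.
interface EncodedFrame {
  seq: number;        // frame sequence number
  temporalId: number; // 0 = base layer; higher values are enhancement layers
}

// Forward only the frames at or below the target temporal layer.
// Dropping the higher layers lowers the frame rate (and the bitrate)
// without asking the sender to re-encode anything.
function forwardUpTo(frames: EncodedFrame[], maxTemporalId: number): EncodedFrame[] {
  return frames.filter(f => f.temporalId <= maxTemporalId);
}

// A typical three-layer temporal pattern: layer ids 0,2,1,2 repeating.
const pattern = [0, 2, 1, 2, 0, 2, 1, 2];
const frames: EncodedFrame[] = pattern.map((temporalId, seq) => ({ seq, temporalId }));

const halfRate = forwardUpTo(frames, 1);    // base + first enhancement layer
const quarterRate = forwardUpTo(frames, 0); // base layer only
console.log(halfRate.map(f => f.seq), quarterRate.map(f => f.seq));
```

The point of the sketch is that the server can serve a weak receiver a lower frame rate simply by not forwarding some of the frames, which is what makes scalable codecs attractive for large group calls.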
👉 Need a boost and have a very good grasp of who is in a call before everyone joins? VP9 might be a good alternative for you.
I’ve written at length about AV1 when the specification got released. You can learn about AV1 there.
There are those who believe AV1 is ready and has been ready for quite some time. Reality says otherwise. It isn’t for the faint of heart at this point. More on that - below.
👉 Adventurous? Go AV1!
VP9 shipped in Chrome 48 for WebRTC. That was January 2016. 4 years later and it is safe to say that not many are using VP9 in WebRTC.
The two main places where VP9 is making sense?
Once AV1 was announced, the debate began about whether one should even try to adopt VP9 or wait for AV1 instead. The majority are waiting for AV1. Laziness at its best (and what I would have selected as well if you’re wondering).
The other reason for delaying and skipping a generation is investment in VP9. Since everyone’s looking at AV1, VP9 is left with fewer eyeballs and developers improving it. Add to that the slow release of SVC support for it in Chrome and the fact that Safari still doesn’t support VP9 and you can understand the reluctance of going this route.
The big Apple is insatiable. Apple has been banking on HEVC for many years now, and where HEVC & WebRTC fit at Apple has been a topic here in the past as well.
In Apple’s release notes for Safari Technology Preview 104 there’s a bullet point that shows where things are headed:
Added initial support for WebRTC HEVC
I wonder whatever for?
To me, this is the biggest conundrum at the moment. A piece of this puzzle is missing. What would make developers use HEVC if it is only available in Safari and nowhere else? This isn’t the app store. It is the web.
Time will tell.
👉 We now know more about HEVC in WebRTC. I wrote more about it here: WebRTC & HEVC – how can you get these two to work together
I said it before and I’ll reiterate it. AV1 is too new. Too early to be adopted in WebRTC or real time communications. And yet… Google just announced supporting AV1 in Google Duo:
[...] in the coming week, we’re rolling out a new video codec technology to improve video call quality and reliability, even on very low bandwidth connections.
They made sure to add a nice moving GIF so you can see the difference between “a video codec” and AV1 in the same bitrate.
Is that other codec VP8? VP9? H.264? HEVC? Maybe H.261…
Are they using it for all Duo calls? In all devices? In all network conditions?
The only thing I could find is that this rolls out to Android with iOS 2 weeks behind in the roll out. There are more things left unsaid.
We’re all stuck at home burning the networks. The large streaming vendors are lowering resolutions (and bitrates) for their default players in certain countries. This reduces the CPU load, making room for improving quality on lower bitrates. And that leads to the ability (and need) of better video codecs.
Google Duo most probably already makes use of VP9. Maybe even HEVC on iOS devices due to hardware acceleration benefits. When it comes to 1:1 sessions, there's no real reason to stick to a single video codec for all sessions.
With Apple working publicly now on HEVC in WebRTC, it put pressure on Google, and getting AV1 into Duo in order to bolster their side in the AV1 vs HEVC debate became a pressing matter. Google Duo's 1:1 call scenarios were the most suitable candidate for Google to make that stand.
When a new video codec generation was introduced, the thinking was simple: “we are expecting it to support a higher resolution, at a higher bitrate, with a higher CPU consumption”
In 2020, things are changing.
I have 4K resolution on my desktop and laptop. 1080p on my phone and TV. I am happy with 720p content most of the time. I hate fonts on a 4K screen that aren’t enlarged (the damn characters are just too small to read).
What is the value of higher resolution? HDR content? 8K? 360? VR? If all I need is just plain video, no higher resolution is required. We’re all content most of the time with 720p resolutions for business meetings anyway.
Resolution requirements for most content types and use cases are not going to get higher any time soon.
We are probably at peak resolution already.
So we are free to think of next gen video codecs as ones that help consume lower bitrates.
There’s a distinction here. While any new video codec generation consumes lower bitrates for the same resolution/quality, the main purpose of these new video codecs was almost always in increasing the resolution as well.
👉 AV1 on mobile makes perfect sense here. Especially for low resolutions - since we can have some CPU to spare for that scenario.
No. Not officially.
Apple is adding support for it in Safari, but no other browser has added support for it or indicated plans to do so.
Yes, but not in browsers.
Apple will introduce HEVC in Safari, but no other vendor will. If you build your own native application for either PC or mobile you can add HEVC as another supported codec and use it in your application.
That depends. If you want to add AV1, you need to make sure your use case fits well, as well as the devices you expect your users to have.
You will also need to put a considerable investment of time and money to make it happen.
My suggestion for most vendors would be to wait with AV1 support.
That is a good question with no good answer.
I believe it is a matter of timing. When the time came to adopt VP9, AV1 was already announced and on its way, so vendors preferred to wait and jump directly to AV1 instead of going for VP9.
VP9 doesn't enjoy much hardware acceleration, which also makes it CPU intensive, requiring companies to tweak, fine tune and optimize their systems to use it. That kind of work is something many prefer not to do.
We’re at war again. The video codec war of WebRTC. And this time, each vendor needs to pick a strategy to play.
We’ve got multiple codecs in our warchest: VP8, H.264, VP9, AV1 and sometimes even HEVC now.
Which one will we be using?
Which ones will we be using?
Here, scenarios matter. Different scenarios will call for totally different video codec selection to optimize for quality, CPU use, performance, bitrate, cost, etc.
In 1:1 sessions, you may want to keep your options open - use the best one dynamically just by making a decision as the session is set up.
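As a rough sketch of what "deciding as the session is set up" could look like, the function below reorders a codec list by a preference list. The `orderCodecs` function, the `CodecLike` type and the order in `PREFERRED_ORDER` are assumptions made up for this example; the only real browser calls mentioned are `RTCRtpReceiver.getCapabilities()` and `RTCRtpTransceiver.setCodecPreferences()`, and those appear in a comment only, so the snippet runs outside a browser as well.

```typescript
// Minimal shape of a codec entry. It mirrors the mimeType field of the
// browser's RTCRtpCodecCapability, so real capability objects fit too.
interface CodecLike {
  mimeType: string;
}

// Hypothetical preference order: newest codec first, VP8 as the last resort.
const PREFERRED_ORDER = ["video/AV1", "video/VP9", "video/H264", "video/VP8"];

// Return a copy of the list with preferred codecs first. Entries that are
// not in the preference list (video/rtx and friends) stay at the end.
function orderCodecs<T extends CodecLike>(codecs: T[], preferred: string[] = PREFERRED_ORDER): T[] {
  const wanted = preferred.map(p => p.toLowerCase());
  const rank = (c: T): number => {
    const i = wanted.indexOf(c.mimeType.toLowerCase());
    return i === -1 ? wanted.length : i;
  };
  return codecs.slice().sort((a, b) => rank(a) - rank(b));
}

// In a browser, before creating the offer (not executed here):
//   const caps = RTCRtpReceiver.getCapabilities("video");
//   if (caps) transceiver.setCodecPreferences(orderCodecs(caps.codecs));

const ordered = orderCodecs([
  { mimeType: "video/VP8" },
  { mimeType: "video/rtx" },
  { mimeType: "video/AV1" },
  { mimeType: "video/H264" },
]);
console.log(ordered.map(c => c.mimeType));
```

In practice the preference list would not be a constant. It is where the per-session decision lives: device type, battery, network conditions and whatever else your service cares about.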
For group calls, will you be using a single, static video codec? Or allow for multiple ones? Will you have multiple codecs in a single group session? Are you going to have an SFU tweaked and tuned for that? Will you pick the best video codec for a session and then dynamically switch over as the nature of the session changes (e.g. when someone with certain limitations joins or leaves)?
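One of the simplest strategies for a group session, picking a single codec that everyone in the call can handle and re-evaluating it whenever the roster changes, can be sketched as follows. The `pickSessionCodec` function, the priority order and the participant capability lists are all hypothetical and serve only as an illustration; a real service would derive them from the negotiated capabilities of each client.

```typescript
// Hypothetical priority order, best first.
const CODEC_PRIORITY = ["AV1", "VP9", "H264", "VP8"];

// Pick the most preferred codec that every participant supports.
// Returns undefined when the participants share no codec at all.
function pickSessionCodec(participants: string[][], priority: string[] = CODEC_PRIORITY): string | undefined {
  for (const codec of priority) {
    if (participants.every(supported => supported.indexOf(codec) !== -1)) {
      return codec;
    }
  }
  return undefined;
}

// Made-up capability lists for three participants.
const desktopA = ["AV1", "VP9", "H264", "VP8"];
const desktopB = ["VP9", "H264", "VP8"];
const limitedDevice = ["H264", "VP8"];

// Two participants: the best shared codec is VP9.
const beforeJoin = pickSessionCodec([desktopA, desktopB]);
// A more limited device joins: the session falls back to H264.
const afterJoin = pickSessionCodec([desktopA, desktopB, limitedDevice]);
console.log(beforeJoin, afterJoin);
```

This is the lowest common denominator approach. The alternative hinted at in the questions above, letting multiple codecs coexist in one session, moves the complexity into the SFU and the senders instead.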
What about consumers? What kind of video codec selection strategies are going to be prevalent there? How are they going to be different than the ones we see in enterprise solutions? What will be the difference for mobile first or application based versus web based solutions?
We live in interesting times.
Codec selection has never been more interesting or important.
While WebRTC mandates 2 video codecs (H.264 & VP8), most browsers support VP9, and now we’re seeing browser vendors either adding HEVC or using AV1 in their own apps. Audio now faces a similar challenge, with both Microsoft and Google introducing AI-powered voice codecs.
If media quality is at the core of your service (think carefully about your answer to this question), then rethinking your video codec selection strategy might be in order.
It is going to require research and investment. But this is where the future lies for video codecs in WebRTC.