OpenAI, LLMs, WebRTC, voice bots and Programmable Video
Learn about WebRTC LLM and its applications. Discover how this technology can improve real-time communication using conversational AI.
WebRTC vs Zoom? WebRTC is actually quite good. But you knew that already - didn’t you? :-)
They say quality is in the eye of the beholder. So behold.
We’ve all been told time and again that this video conferencing vendor or that video conferencing vendor works great. They offer the best quality. The best experience. They work in conditions that others don’t.
I even had a call once with an entrepreneur who explained to me how he was going to offer a service with better 1:1 video quality than Skype and Google Hangouts. And he was going to do it with WebRTC. I spent the better part of that call trying to talk him out of that idea (something about his logic was off there).
But I am digressing.
Like many others, I’ve been told time and again how Zoom is great. How in spite of the fact that it doesn’t work in the browser and forces you to download its client (some even refer to it as a virus), it gets traction and adoption. It feels like it is the best game in town. And then they mention the reasons:
I am not the only one who needs to listen to it, and even believe it to some extent. The guys at Jitsi got curious - why not put it to the test?
So they took a Mac device, placed it on a WiFi network, added a network limiter so they can fiddle with the network configuration, and did a 1:1 call. Once with Zoom. And once with WebRTC.
The idea is this - start with as much bandwidth as the video call wants. Then limit it to 500 kbps. Check how much time it takes to adapt. Remove the limit and check how much time it takes to adapt back. More about it in Jitsi’s blog.
Essentially - testing for these network conditions:
The longer the marked areas, the worse the experience is going to be for the users.
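To make "time to adapt" concrete, here is a minimal sketch of how one could measure it from a bitrate time series. The helper name timeToAdapt, the sample format and the numbers are all made up for illustration; this is not how Jitsi ran their measurements.

```javascript
// Hypothetical helper: how long after a network change does the bitrate
// settle inside a target band and stay there until the end of the samples?
// samples: [{ t: seconds, kbps: number }], sorted by t.
function timeToAdapt(samples, changeAt, minKbps, maxKbps) {
  const after = samples.filter((s) => s.t >= changeAt);
  let settledAt = null;
  // Walk backwards and stop at the last sample that is still outside the band.
  for (let i = after.length - 1; i >= 0; i--) {
    const inBand = after[i].kbps >= minKbps && after[i].kbps <= maxKbps;
    if (!inBand) break;
    settledAt = after[i].t;
  }
  // null means the bitrate never settled inside the band.
  return settledAt === null ? null : settledAt - changeAt;
}

// Made-up samples (not measurements): a 500 kbps limit is applied at t = 60.
const samples = [
  { t: 50, kbps: 2000 },
  { t: 60, kbps: 1900 },
  { t: 65, kbps: 1200 },
  { t: 70, kbps: 700 },
  { t: 75, kbps: 480 },
  { t: 80, kbps: 470 },
  { t: 85, kbps: 490 },
];

console.log(timeToAdapt(samples, 60, 0, 500)); // 15
```

The same function works for the ramp-up phase: pass the time the limit was removed and a band around the expected full bitrate.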
And guess what? Zoom fared worse than WebRTC. Not a little, but a lot worse.
Full adaptation to limiting the bandwidth took WebRTC 20 seconds. It took Zoom 156 seconds (!).
Ramping back up to 2 Mbps took WebRTC 32 seconds. It took Zoom 62 seconds.
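To put those numbers side by side, a quick back-of-the-envelope calculation using only the four figures quoted above (the variable names are mine):

```javascript
// Adaptation times in seconds, as quoted from the Jitsi test.
const seconds = {
  webrtc: { limit: 20, rampUp: 32 },
  zoom: { limit: 156, rampUp: 62 },
};

// How many times longer Zoom took than WebRTC, rounded to one decimal.
const slowdown = (phase) => (seconds.zoom[phase] / seconds.webrtc[phase]).toFixed(1);

console.log(slowdown('limit'));  // "7.8"
console.log(slowdown('rampUp')); // "1.9"
```

So in this one test Zoom took almost 8 times longer to adapt to the limit, and about twice as long to ramp back up.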
Now here’s my analysis of this.
Yep. It really does.
The screen capture from that Zoom blog post that was pasted by Jitsi?
Stating that “web-RTC is a very limited solution that would not allow us to provide all the excellent features that our users have come to expect from us”?
That’s from 2015.
A lot has improved in WebRTC since then, if that explanation was even correct in 2015 to begin with.
Without the need for most of us to do anything, we’re getting updates to a top notch media engine in the form of WebRTC inside the browsers we use. The code used in Chrome is open source, so anyone can embed it in their own applications as well.
Security fixes? New codecs? Improved media algorithms? They just “happen”. Out of thin air. For most of us.
If I look at it from Zoom’s point of view, besides the fact of being a dominant player in the market with or without WebRTC, here are the challenges with such a test scenario:
Which leads us to the fact that more tests are needed to know which one is best and in which scenarios.
This starts to sound like the VP8 vs H.264 quality comparisons of the past (I never could tell the difference).
With WebRTC, it all boils down to the infrastructure. The one with the better deployment wins the quality game.
And the list goes on.
With vendors who use proprietary codecs and transport protocols, this is doubly so, as they need to cater for the browser once they reach WebRTC. So while their native apps might be optimized, it might all go down the drain once they transcode or just “translate” to reach the browser using WebRTC.
Need to understand WebRTC and how to design and architect real world solutions with it? A first step is to understand the servers used to connect WebRTC.
Which brings us to why someone like Zoom should use WebRTC and think about the quality issues once connecting to it:
UPDATE: This section was updated to reflect what was later found while investigating what Zoom is doing. Head over to webrtcHacks for the full story.
Zoom supports something like WebRTC. I just found out when I searched for stuff to write this article: there’s a Zoom Web Client.
It runs on Chrome and enables using audio in Chrome when joining meetings. No video, probably because transcoding the proprietary video codec Zoom uses to the ones in WebRTC is too complicated, but using G.711 or Opus in the browser and transcoding or using the same in Zoom is way simpler.
It tries its best NOT to use WebRTC and still get something working on the browser, which is no easy feat.
I expect Zoom to eventually go through the same phases that Amazon did with Chime:
Other vendors have been going down this exact same path in one way or another.
While writing this article, it dawned on me that this is one of those scenarios that are ridiculously easy to simulate using testRTC, so I went ahead and created a script that does just that:
Here’s what the main part of the script looks like:
// sec and probeType come from earlier in the full script (not shown here)

// Wait for 1 minute
client
    .pause(60*sec)
    .rtcScreenshot('ALL GOOD');

// Only the first probe gets its network limited
if (probeType === 1) {
    client
        .rtcEvent('Start limit', 'global')
        .rtcSetNetworkProfile('custom', 'bandwidth', 500000, 'both', 'both');
}

// 2 minutes with bandwidth limits
client
    .pause(60*sec)
    .rtcScreenshot('LIMITED')
    .pause(60*sec);

if (probeType === 1) {
    client
        .rtcSetNetworkProfile('') // back to pristine network conditions
        .rtcEvent('Stop limit', 'global');
}

// 2 more minutes unlimited
client
    .pause(60*sec)
    .rtcScreenshot('BACK TO NORMAL')
    .pause(60*sec);
The .rtcEvent() calls are there to place vertical lines on the graphs, while the .rtcSetNetworkProfile() calls are there to fiddle around with the network conditions.
There were two probes here, each one a participant in the call. The first one is the one I limited while the second one was left “untouched”.
Here’s what the graphs look like on the second probe:
The above graph shows the outgoing bitrate. Within a span of 5 seconds, WebRTC finds out the new effective bitrate and adapts to it. Ramping back up takes some 20 seconds.
The above graph shows the incoming frame rate. You can see how frame rate reporting in WebRTC takes a bit of time to get back to its usual self - also some 20 seconds or so.
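If you want to derive similar numbers in your own application, here is a rough sketch of how bitrate and frame rate can be computed from two WebRTC stats snapshots. The fields timestamp, bytesSent and framesDecoded are standard entries in the reports returned by RTCPeerConnection.getStats(); the helper names and the sample values are mine, and I am not claiming this is how testRTC draws its graphs.

```javascript
// prev and curr are two snapshots of the same stats entry taken at
// different times. timestamp is in milliseconds.
function ratePerSecond(prev, curr, field) {
  const elapsedSec = (curr.timestamp - prev.timestamp) / 1000;
  if (elapsedSec <= 0) return 0; // guard against identical or reordered snapshots
  return (curr[field] - prev[field]) / elapsedSec;
}

// Outgoing bitrate in bits per second, from an "outbound-rtp" entry.
const outgoingBitrate = (prev, curr) => ratePerSecond(prev, curr, 'bytesSent') * 8;

// Incoming frame rate in frames per second, from an "inbound-rtp" entry.
const incomingFrameRate = (prev, curr) => ratePerSecond(prev, curr, 'framesDecoded');

// Made-up snapshots taken one second apart. In a real call bytesSent and
// framesDecoded live in different reports; they share one object here
// only to keep the example short.
const before = { timestamp: 1000, bytesSent: 100000, framesDecoded: 300 };
const after = { timestamp: 2000, bytesSent: 162500, framesDecoded: 330 };

console.log(outgoingBitrate(before, after));   // 500000
console.log(incomingFrameRate(before, after)); // 30
```

Sampling once a second and plotting the results gives you curves much like the ones above, which is enough to eyeball how long an adaptation takes.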
I wanted to check how the Jitsi SFU would behave, so I tweaked the test URL for that. The results? Still better than the Zoom one. 20 seconds to hit 30 frames per second and around 50 seconds to get back to full bitrate.
If you want to try it yourself, just import the JSON file in this Google Drive folder to your testRTC account and modify it to fit your needs.
WebRTC is more than good enough.
Making it better is usually about thinking your way through the best possible architecture, along with media servers that take care of network conditions properly.
As for Zoom… please make sure your next call with me is on something that has WebRTC. The machine I regularly use for calls runs Linux. Zoom doesn’t work there… it doesn’t really support Chrome or Linux. Yet.