WebRTC open source is a mess. It needs to grow out of its youth and become serious business - or gain serious backing.
This article has been written along with Philipp Hancke. We cooperate on many things - WebRTC courses (new one coming up soon) and WebRTC Insights to name a few.
---
WebRTC is free. Every modern browser incorporates WebRTC today. And the base code that runs in these browsers is open sourced and under a permissive BSD license. In some ways, free and open source were mixed in a slightly toxic combination. One in which developers assume that everything WebRTC should be free.
The end result? The sorry state in which we find ourselves today, 11 years after the announcement of WebRTC. In this article, we’re going to detail the state of the WebRTC open source ecosystem, and why we feel a change is necessary to ensure the healthy growth of WebRTC for years to come.
We’ll start with the most important thing you need to know:
Open Source != Free
Let's take a quick step back before we dive into it though.
An open source project is a piece of source code that is publicly available for anyone under one of the many open source licenses out there. Someone, or a group of people from the same company or from disparate places, have “banded together” and created a piece of software that does something. They put the code of that software out in the open and slap a license on top of it. That ends up being an open source project.
Open source isn’t free. There’s a legal binding associated with using open source, but it isn’t what we’re interested in here. It is the fact that if you use open source, it doesn’t mean that you pay nothing to no one. It just means that you get *something* with no strings attached.
Why would anyone end up doing this for free? Well… that brings us to business models.
There are different types of open source licenses, each with its own set of rules. Some are more permissive than others, which makes them business-friendly. Sometimes the license type itself is used as a business model, simply by offering a dual license mode where a non-permissive open source license is available freely and a commercial one is available in parallel.
In other cases, the business model of the open source project revolves around offering support, maintenance and customization of that project. You get the code for free, but if you want help with it - you can pay!
Sometimes, the business model is around additional components (this is where you will see things like community edition and enterprise edition popping up as options on the project’s website). Things such as scripts for scaling the system, monitoring modules or other operational and functional components are protected as commercial products. The open source part gets companies to use the project and raises its popularity and awareness, while the commercial one is the reason for doing it all: it is how the developers behind the project put food on the table and become rich.
In recent years, you see business models revolving around managed services. The database is open source and free, but if you let us host it for you and pay for it, we’ll take care of all your maintenance and scaling headaches.
And some believe it is really and truly free. Troy Hunt wrote about it recently (it is a really good post - go read it):
“... there is a suggestion that those of us who create software and services must somehow be in it for the money”
To that I say - yes!
At the end of the day, delving into open source is all about the money.
Why?
The moment the open source project you are developing is meaningful to two more people, or even a single company, there are monetary benefits to be gleaned. We’d venture that if you aren’t making anything from these benefits (even minor ones), then the open source project has no real future. It gets to a point where it should either grow up or wither and die.
Just a few things before we start our journey to the WebRTC open source realm:
A common mistake by “noobs” is assuming that WebRTC is a solution that requires no coding: since browsers already implement it, there’s nothing left to do. This couldn’t be further from the truth.
WebRTC as a protocol requires a set of moving parts, clients and servers, that together enable the rich set of communication solutions we’re seeing out there.
The diagram above, taken from the Advanced WebRTC Architecture course, shows the various components necessary in a typical WebRTC application:
For each and every component here, you can find one or more open source projects that you can use to implement it. Some are better than others. Many are long forgotten and decaying. A few are pure gold.
Let’s dive into each of these components to see what’s available and in what state we find the open source community for them.
First and foremost, we have the WebRTC open source client libraries. These are implementations of the WebRTC protocol from a user/device/client perspective. Consider these your low level API for WebRTC.
There used to be only a single one - libwebrtc - but with time, more were introduced and took their place in the ecosystem. Which is why we will start with libwebrtc:
THE main open source project of WebRTC is libwebrtc.
Why?
Practically speaking - libwebrtc is everywhere WebRTC is.
Here are a few things you need to know about this library:
Looking at the contributions over time, Google is doing more than 90% of the work:
The amount of changes has been decreasing year-over-year after peaking in early 2016. During the pandemic we even reached a low point with less than 200 commits per month on average. Even with these reduced numbers libwebrtc is the largest and most frequently updated project in the open source WebRTC ecosystem.
The number of external contributions is fairly low, below 10%. This doesn’t bode well for the future of libwebrtc as the industry’s standard library of WebRTC. It would be better if Google opened up a bit more for contributions that improve WebRTC or those that make it easier to use by others.
This leads us to the business model aspect of libwebrtc 👇
💰 Money time
What if one decides to use libwebrtc and integrate it directly in his own application?
That said, for the most part, and in most situations, libwebrtc is the best alternative - that’s because it follows the exact implementations you will be bumping into in web browsers. It will always be the most up to date one available.
A side note - libwebrtc is implemented in C++. Why is this relevant? Pion 👇
Pion is a Go implementation of the WebRTC APIs. Sean DuBois is the heart and soul behind the Pion project and his enthusiasm about it is infectious.
Putting on Tsahi’s cynic hat, Pion’s success can be attributed a lot to it being written in Go. And that’s simply because many developers would rather use Go (modern, new, hip) and not touch C++.
Whatever the reason is, Pion has grown quite nicely since its inception and is now quite a popular WebRTC open source project. It is used in embedded devices, cloud based video rendering and recently even SFU and other media server implementations.
💰 Money time
What if one decides to use Pion and integrate it directly in his own application?
There are other implementations of WebRTC in other languages.
The most notable ones:
There are probably others, less known.
We won’t be doing any 💰 Money time section here. These projects are still too small. We haven’t seen too many services using them in production and at scale.
GStreamer is an open source media framework that is older than WebRTC. It is used in many applications and services that use WebRTC, even without using its WebRTC capabilities (mainly since these were added later to GStreamer).
We see GStreamer used by vendors when they need to transform video content in real-time. Things like:
Since WebRTC was added as another output type in GStreamer, developers can use it directly as a broadcasting entity - one that doesn’t consume data but rather generates it.
GStreamer is a community effort and written in C. While it is used in many applications (commercial and otherwise), it lacks a robust commercial model. What does that mean?
💰 Money time
What if one decides to use GStreamer and integrate it directly in his own application?
Next we have open source TURN servers. And here, life is “simple”. We’re mostly talking about coturn. There are a few other alternatives, but coturn is by far the most popular TURN server today (open source or otherwise).
In many ways, we don’t need more than that, because TURN is simple and a commodity when it comes to the code implementation itself (up to a point, as Cloudflare is or was❓ trying to change that with their managed service).
But, and there’s always a but in these things, coturn needs to get updated and improved as well. Here’s a recent discussion posted as an issue on coturn’s GitHub repo:
Is the project dead?
Read the whole thread there. It is interesting.
The maintainers of coturn are burned out, or just don’t have time for it (=they have a day job). For such a popular project, the end result was a volunteer or two from the industry picking up the torch and doing this in parallel to their own day job.
Which leads us to:
💰 Money time
What if one decides to use coturn and integrate it directly in his own application?
Signaling servers are a different beast. WebRTC doesn’t define them exactly, but they are needed to pass the SDP messages and other signals between participants. There are several alternatives here when it comes to open source signaling solutions for WebRTC.
It should be noted that many of the signaling server alternatives in WebRTC offer purely peer communication capabilities, without the ability to interact with media servers. Some signaling servers will also process audio and video streams. How much they focus on the media side versus the signaling side will decide if we will be treating them here as signaling servers or media servers - it all boils down to their own focus and to the functions they end up offering.
Signaling requires two components - a signaling server and a client side library (usually lightweight, but not always).
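To make the server side of this concrete, here is a minimal sketch of what a signaling server does at its core: it relays messages (SDP offers, answers, ICE candidates) between participants and never touches the media. Everything in it is hypothetical and ours, not taken from any of the projects discussed here: the `SignalingRelay` class, the room concept, and the `to`/`from` message fields are illustrative names, and a real server would sit behind WebSocket or a similar transport rather than in-memory callbacks.

```python
from collections import defaultdict
from typing import Callable, Dict

Message = dict
Deliver = Callable[[Message], None]


class SignalingRelay:
    """In-memory stand-in for a signaling server (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # room name -> {peer id -> callback used to deliver messages to that peer}
        self._rooms: Dict[str, Dict[str, Deliver]] = defaultdict(dict)

    def join(self, room: str, peer_id: str, deliver: Deliver) -> None:
        self._rooms[room][peer_id] = deliver

    def leave(self, room: str, peer_id: str) -> None:
        self._rooms[room].pop(peer_id, None)

    def send(self, room: str, sender: str, message: Message) -> int:
        """Forward a message to its "to" peer, or to everyone else in the room.

        Returns the number of peers the message was delivered to.
        """
        peers = self._rooms.get(room, {})
        target = message.get("to")
        delivered = 0
        for peer_id, deliver in list(peers.items()):
            if peer_id == sender:
                continue  # never echo a message back to its sender
            if target is not None and peer_id != target:
                continue  # addressed to someone else
            deliver({**message, "from": sender})
            delivered += 1
        return delivered


# Usage: two peers exchange an offer and an answer through the relay.
relay = SignalingRelay()
alice_inbox, bob_inbox = [], []
relay.join("demo", "alice", alice_inbox.append)
relay.join("demo", "bob", bob_inbox.append)
relay.send("demo", "alice", {"type": "offer", "sdp": "<alice's SDP offer>"})
relay.send("demo", "bob", {"type": "answer", "sdp": "<bob's SDP answer>", "to": "alice"})
```

The relay treats the SDP as an opaque string, which is the point: the hard parts of a production signaling server are authentication, presence, reconnection and scaling, none of which appear in this sketch.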
We will start with the standardized ones - SIP & XMPP.
SIP and XMPP preceded WebRTC by a decade or so. They have their own ecosystem of open source projects, vendors and developers. They act as mature and scalable signaling servers, sometimes with extensions to support WebRTC-specific use-cases like creating authentication tokens for TURN servers.
We will not spend time explaining the alternatives here because of this.
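As an illustration of the TURN authentication tokens mentioned above, here is a minimal sketch of one common scheme: time-limited credentials derived from a secret shared between the signaling server and the TURN server (often called the TURN REST API approach, which coturn supports in its shared-secret mode). We are assuming this scheme for the example; the article does not say which one a given SIP or XMPP server uses. The function name, the `now` parameter and the sample secret are hypothetical.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional


def make_turn_credentials(
    user_id: str,
    shared_secret: str,
    ttl_seconds: int = 3600,
    now: Optional[float] = None,
) -> dict:
    """Build short-lived TURN credentials (hypothetical helper, for illustration).

    The username embeds the expiry time; the password is an HMAC-SHA1 of that
    username keyed with the shared secret, so the TURN server can verify it
    without a per-user database lookup.
    """
    issued_at = time.time() if now is None else now
    expiry = int(issued_at) + ttl_seconds
    username = f"{expiry}:{user_id}"
    digest = hmac.new(shared_secret.encode(), username.encode(), hashlib.sha1).digest()
    credential = base64.b64encode(digest).decode()
    return {"username": username, "credential": credential, "ttl": ttl_seconds}


# Usage: the signaling server hands these to the client together with the
# TURN server URLs, and the client places them in its ICE server configuration.
creds = make_turn_credentials("alice", "replace-with-the-real-shared-secret")
```

The appeal of this design is that the signaling server and the TURN server only need to agree on the secret and the clock; no credentials are stored, and a leaked token stops working once it expires.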
👉 Here, it is worthwhile mentioning MQTT as well. Facebook is known to be using it (at least in the past - not sure about today) in their Facebook Messenger for signaling.
PeerJS has been around for almost as long as WebRTC itself. For an extended period of that time, the codebase has not been maintained or updated to fit what browsers supported. Today, it seems to be kept.
The project seems to focus on a monolithic single server deployment, without any thought about horizontal scaling. For most, this should be enough.
Throughout the years, PeerJS has changed hands and maintainers, including earlier this year:
Without much ado, let’s move to the beef of it:
💰 Money time
What if one decides to use PeerJS and integrate it directly in his own application?
Simple-Peer has been driven by Feross and his name in the early days. It is another one of those “pure WebRTC” libraries that focuses solely on peer-to-peer. If that fits your use-case, great, it is mature and “done”. Most of the time your use-case will evolve over time though.
It has received only a few maintenance commits in 2022 and not many more in 2021. The same considerations as for PeerJS apply to simple-peer. If you need to pick between the two… go for simple-peer, the code is a bit more idiomatic JavaScript.
💰 Money time
Just go read PeerJS - same rules apply here as well.
Matrix is “an open network for secure, decentralized communication”. There’s also an open standard to it as well as a commercial vendor behind it (Element).
Matrix is trying to fix SIP and XMPP by being newer and more modern. But the main benefit of Matrix is that it comes as client and server along with implementations that are close to what Slack does - network and UI included. It is also built with scale in mind, with a decentralized architecture and implementation.
Here we’re a bit unaligned… Tsahi thinks Matrix is a good alternative and choice while Philipp is… less thrilled. Their WebRTC story is a bit convoluted for some, meandering from full mesh to Jitsi to a “native SFU” only recently.
So… Matrix has a company behind it. But they have their own focus (messaging service competing with Slack with privacy in mind).
💰 Money time
What if one decides to use Matrix and integrate it directly in his own application?
At the time of writing, there are 26,121 repositories on GitHub mentioning WebRTC. By the time you read this, that number will have grown some.
Not many are sticking out too much, and in that jumble, it is hard to figure out which projects are right for you. Especially if what you need needs to last. And doubly so if you’re looking for something that has decent enough support and a thriving community around it.
Another set of important open source WebRTC components are media servers and SFUs.
While signaling servers deal with the peer communication needed to set up the actual sessions, media servers are focused on the channels - the actual data that we want to be sending - audio and video streams, offering realtime video streaming and processing.
👉 Whenever you need group sessions, broadcasts or recordings (and you will, assuming you’d like video calls or video conferences incorporated in your application), you will end up with media servers.
Here’s where we are marketwise 👇
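A quick back-of-the-envelope calculation shows why group sessions push you towards a media server. The sketch below compares a full mesh (every participant sends directly to every other one) with an SFU (every participant sends once, and the server forwards). It assumes a simplified model that is ours, not the article's: one stream per participant, everyone receives everyone else, and no simulcast or stream selection. The function name and the dictionary keys are hypothetical.

```python
def stream_counts(participants: int) -> dict:
    """Count media streams per client in a full mesh versus through an SFU.

    Simplified model: each participant publishes one stream and subscribes
    to all the others.
    """
    if participants < 2:
        raise ValueError("a session needs at least two participants")
    others = participants - 1
    return {
        # Mesh: the client encodes and uploads one copy per remote peer.
        "mesh_uplinks_per_client": others,
        "mesh_downlinks_per_client": others,
        # SFU: a single upload; the server fans it out to the other peers.
        "sfu_uplinks_per_client": 1,
        "sfu_downlinks_per_client": others,
        # Total streams the SFU itself has to forward.
        "sfu_streams_forwarded": participants * others,
    }


# Usage: compare a 1:1 call with a 10-person meeting.
small_call = stream_counts(2)
meeting = stream_counts(10)
```

In this model the upload burden on each client grows with the group size in a mesh but stays at one stream with an SFU, while the forwarding work moves to the server, which is exactly the piece you then have to build, run and scale.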
I’ve written about these projects at length in my 2022 WebRTC trends article. Here’s a visual refresher of the relevant part of it:
Janus, Jitsi, mediasoup and Pion are all useful and popular in commercial solutions. Let’s try to analyze them with the same prism we did for the other WebRTC open source projects here.
Jitsi can be considered a platform of its own:
💰 Money time
To be clear - in all cases above, getting vendors to help you out who aren’t maintaining the specific media server codebase means results are going to be variable when it comes to the quality of the implementation. In other words, it is hard to figure out who to work with.
The Kurento Media Server is dead. So much so that even the guys behind it went to build OpenVidu (below) and then made OpenVidu work on top of mediasoup.
Don’t touch it with a long stick.
It has been dead for years and from time to time people still try using it. Go figure.
Higher layer abstraction open source projects strive to become platforms of sorts. Their main focus in the WebRTC ecosystem is to offer a layer of tooling on top of open source media servers. The two most notable ones are probably OpenVidu and LiveKit.
OpenVidu is a kind of an abstraction layer to implement a room service, UI included.
It originates from the team left behind from the Kurento acquisition. With time, they even adopted mediasoup as the media server they are using, putting Kurento aside for the most part.
💰 Money time
Unlike many of the open source solutions we’ve seen so far, OpenVidu actually seem like they have a business model:
LiveKit offers an “open source WebRTC infrastructure” - the management layer above Pion SFU.
For the life of me though, I don’t understand what the business model is for LiveKit. They are a company - not just an open source project, and as such, they need to have revenue to survive.
Most probably they get some support and development money from enterprises adopting LiveKit, but that isn’t easily apparent from their website.
There are other companies who offer commercial solutions that are proprietary in nature. Some do it as on premise alternatives, where they provide the software and the support, while you need to deploy and maintain.
These can either be suitable solutions or disasters waiting to happen. Especially when such a vendor decides to pivot or leave the market.
Tread carefully here.
This has been a long overview, but I think we can all agree.
The current state of WebRTC open source is abysmal:
If it were up to us, and it isn’t, we’d like to see a more sophisticated market out there. One that gives more and better commercial solutions for enterprises and entrepreneurs alike.