In group calls there are different ways to decide on WebRTC server allocation. Here are some of them, along with recommendations of when to use what.
In WebRTC group calling, media server scaling is one of the biggest challenges. There are multiple scaling architectures that are used, and most likely, you will be aiming at a routing alternative, where media servers are used to route media streams between the various participants of a session.
As your service grows, you will need to deal with scale:
In all these instances, you will have to deal with the following challenge: How do you decide on which server to allocate a new user? There are various allocation schemes to choose from for WebRTC group calling, each with its own advantages and challenges. Below, I’ll highlight a few such schemes to help you with implementing the WebRTC allocation scheme that is most suitable for your application.
First things first. Media servers in WebRTC don’t scale well. For most use cases, a single server will be able to support 200-500 users. When a server supports more than that, it is usually because it sends lower bitrates by design, supports only voice, or is built to handle only one-way live streaming scenarios.
This can be viewed as a bad thing, but in some ways it isn’t all bad - with cloud architectures, it is preferable to keep the blast radius of failures small, so that a faulty machine ends up affecting fewer users and sessions. WebRTC media servers force developers to handle scaling earlier in their development.
Our first order of the day is usually going to be deciding how to deal with more than a single media server in the same data center location. We are likely to load-balance these media servers through our signaling server policy, effectively associating a media server with a user or a media stream when the user joins a session. Here are a few alternatives for making this decision.
This one is rather straightforward. We fill a media server up to capacity before moving on to fill the next one.
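Here is a minimal sketch of that idea in Python. The `MediaServer` class, its `capacity` field and the server names are all hypothetical, and capacity is counted in users only to keep things simple.

```python
from dataclasses import dataclass


@dataclass
class MediaServer:
    name: str
    capacity: int  # hypothetical per-server user limit
    users: int = 0


def allocate_fill_first(servers):
    """Return the first server (in list order) that still has room, or None."""
    for server in servers:
        if server.users < server.capacity:
            server.users += 1
            return server
    return None  # every server is full; the caller has to scale out


servers = [MediaServer("ms-1", capacity=2), MediaServer("ms-2", capacity=2)]
placed = [allocate_fill_first(servers).name for _ in range(3)]
print(placed)
```

The second server only starts receiving users once the first one has reached its limit.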
Advantages:
Challenges:
In this technique, we look for the media server that has the most free capacity on it and place the new user or session on it.
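A rough sketch of this in Python could look like the following. The `MediaServer` class and the numbers are made up for illustration, and "free capacity" is assumed to be a simple user count.

```python
from dataclasses import dataclass


@dataclass
class MediaServer:
    name: str
    capacity: int  # hypothetical per-server user limit
    users: int = 0


def allocate_least_used(servers):
    """Place the new user on the server with the most free capacity."""
    candidates = [s for s in servers if s.users < s.capacity]
    if not candidates:
        return None  # nothing has room left
    target = max(candidates, key=lambda s: s.capacity - s.users)
    target.users += 1
    return target


servers = [
    MediaServer("ms-1", capacity=10, users=7),
    MediaServer("ms-2", capacity=10, users=3),
]
chosen = allocate_least_used(servers)
print(chosen.name, chosen.users)
```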
Advantages:
Challenges:
Our “don’t think too much” approach. Allocate the next user or session to a server and move on to the next one in the list of servers for the next allocation.
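As a sketch, this needs nothing more than a list and a running index. The class name and server names below are hypothetical, and this version deliberately does not look at load at all.

```python
class RoundRobinAllocator:
    """Hand out servers in list order, wrapping around at the end."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0

    def allocate(self):
        server = self._servers[self._next]
        # Advance the pointer so the next allocation lands on the next server.
        self._next = (self._next + 1) % len(self._servers)
        return server


allocator = RoundRobinAllocator(["ms-1", "ms-2", "ms-3"])
picks = [allocator.allocate() for _ in range(5)]
print(picks)
```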
Advantages:
Challenges:
Then there’s the approach of picking a server at random. It sounds reckless, but in many cases, it can be just as useful as least used or round robin.
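A small Python sketch shows why random is less reckless than it sounds. The server names and the fixed seed are assumptions for the sake of a repeatable demo; a real service would not seed the generator.

```python
import random
from collections import Counter


def allocate_random(servers, rng=random):
    """Pick any server with equal probability; no load information is used."""
    return rng.choice(servers)


servers = ["ms-1", "ms-2", "ms-3"]
rng = random.Random(7)  # seeded only so the demo is repeatable
counts = Counter(allocate_random(servers, rng) for _ in range(3000))
print(counts)
```

Over a large number of allocations, each server ends up with roughly a third of the users, which is close to what round robin would give.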
Advantages:
Challenges:
The second part is determining which region to send a session or a user in a session to.
If you plan on designing your service around a single media server handling the whole session, then the challenge is going to be where to open a brand new session (adding more users takes place on that same server anyway). Today, many services are moving away from the single server approach to a more distributed architecture.
Let’s see what our options are here in general.
The first user in a session decides in which region and data center it gets created. If there is more than one media server in that data center, then we go with our single data center allocation techniques to determine which one to use.
This is the most straightforward and naive approach, making it almost the default solution many start with.
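A minimal sketch of this in Python is shown below. The class name and region labels are hypothetical, and it assumes that the region closest to each joining user (`user_region`) has already been worked out elsewhere, for example by the signaling server.

```python
class FirstInRoomAllocator:
    """Pin each session to the region of the first user that joined it."""

    def __init__(self):
        self._session_region = {}

    def region_for(self, session_id, user_region):
        # setdefault stores user_region only if the session has no region yet,
        # so later joiners get the region chosen by the first one.
        return self._session_region.setdefault(session_id, user_region)


allocator = FirstInRoomAllocator()
first = allocator.region_for("lesson-1", "ap-south")
second = allocator.region_for("lesson-1", "eu-west")
print(first, second)
```

The second user is closer to `eu-west`, but still lands in `ap-south` because that is where the session was opened.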
Advantages:
Challenges:
👉 Note that everything has a solution. The solutions, though, make this harder to implement and may degrade the user experience in the edge cases they deal with.
You can let the first user that joins the room decide the geolocation, or you can use other means to do that. Here, the intent is to use something your application knows in advance to make the decision.
For example, if this is a course lesson with the teacher joining from India and all the students are joining from the UK, it might be beneficial to connect everyone to a media server in the UK or vice versa - depending on where you want to put the focus.
A similar approach is to have the session location determined by the host, either by where the host joins from (similar to first in room) or by the host’s configuration, set at account creation or at session creation.
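One way to picture this is as a chain of fallbacks. The sketch below is only an illustration: the `region` and `default_region` keys are made-up configuration fields, not something taken from a real product.

```python
def choose_region(session_config, host_account, first_user_region):
    """Prefer what the application knows in advance over who joined first."""
    return (
        session_config.get("region")          # set at session creation
        or host_account.get("default_region") # set at account creation
        or first_user_region                  # fall back to first in room
    )


# A lesson with a teacher in India and students in the UK, pinned to the UK.
pinned = choose_region({"region": "uk"}, {"default_region": "india"}, "india")
by_host = choose_region({}, {"default_region": "india"}, "uk")
fallback = choose_region({}, {}, "uk")
print(pinned, by_host, fallback)
```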
Advantages:
Challenges:
Cascading is also viewed as distributed/mesh media servers architecture - pick the name you want for it.
With cascading, we let media servers communicate with each other to cater for a single session together. This approach is how modern services scale or increase media quality - in many ways, many of the other schemes here are “baked” into this one. Here are a few techniques that are applicable here:
Advantages:
Challenges:
This one surprised me the first time I saw it. In this approach, we “disconnect” all incoming traffic from outgoing and treat each of them separately as if it were an independent live stream.
What does that mean? When a user joins, they will always connect to the media server closest to them in order to send their media. For the incoming media from other users, they will subscribe to those users’ streams directly on the media servers those users are connected to.
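The resulting connection plan can be sketched in a few lines of Python. The user names, region labels and server names are hypothetical, and "closest" is simplified to a one-server-per-region lookup.

```python
def plan_connections(users, servers_by_region):
    """users maps user -> region; returns (publish, subscribe) plans."""
    # Each user sends media to the server closest to them.
    publish = {user: servers_by_region[region] for user, region in users.items()}
    # Each user receives every other user's media from that user's own server.
    subscribe = {
        user: {other: publish[other] for other in users if other != user}
        for user in users
    }
    return publish, subscribe


users = {"alice": "eu", "bob": "us", "carol": "us"}
servers_by_region = {"eu": "ms-eu-1", "us": "ms-us-1"}
publish, subscribe = plan_connections(users, servers_by_region)
print(publish["alice"], subscribe["alice"])
```

Note how alice publishes to the EU server but pulls both of her incoming streams from the US server, since that is where bob and carol are sending.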
Advantages:
Challenges:
One thing I ignored in all this is how you know when a server is “full”. This decision can be made in multiple ways, and I’ve seen different vendors take different approaches here. There are two competing aspects here to deal with:
Here are a few examples, so you can make an informed decision on your end:
Sometimes, we will use multiple metrics to make our allocation decision.
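Combining metrics can be as simple as treating a server as full once any one of them crosses its limit. The sketch below assumes three metrics (CPU, outgoing bandwidth and stream count) and threshold values that are purely illustrative; pick the ones that match your own media server.

```python
from dataclasses import dataclass


@dataclass
class ServerLoad:
    cpu_percent: float
    egress_mbps: float
    streams: int


# Hypothetical thresholds, not recommendations.
LIMITS = ServerLoad(cpu_percent=70.0, egress_mbps=800.0, streams=1500)


def is_full(load, limits=LIMITS):
    """A server counts as full when any single metric reaches its limit."""
    return (
        load.cpu_percent >= limits.cpu_percent
        or load.egress_mbps >= limits.egress_mbps
        or load.streams >= limits.streams
    )


quiet = ServerLoad(cpu_percent=35.0, egress_mbps=200.0, streams=400)
busy_network = ServerLoad(cpu_percent=35.0, egress_mbps=850.0, streams=400)
print(is_full(quiet), is_full(busy_network))
```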
Scaling group calls isn’t simple once you dive into the details. There are quite a few WebRTC allocation schemes that you can use to decide where to place new users joining group sessions, each with its own advantages and challenges.
Pick your poison 🧪
👉 One last word - this article was written based on a new lesson that was just added to the Advanced WebRTC Architecture course. If you are looking for the best WebRTC training, then check out my WebRTC Courses.