Anycast enables WebRTC services to better manage and optimize global deployments at scale.
In 2021 we started seeing a new technology finding its way more and more into WebRTC applications: Anycast. Unlike other shiny new toys, Anycast isn’t shiny and it isn’t new. In fact, it was defined in the previous millennium, before the era of the smartphone.
I’ve been “doing” VoIP for over 20 years now, but wasn’t really aware of Anycast. I dug around a bit, and ended up sitting with William King, CTO & Co-founder of Subspace, to learn more about Anycast and its use with WebRTC.
Here’s what I learned about how WebRTC developers can and are using Anycast – and how it can assist them in their own deployments.
What is Anycast anyway?
For someone sitting in the clouds today, the lowest level of networking you can think of is the IP level (I am told there are lower levels, but for me IP is low enough).
At that level, if one machine wants to reach another, it needs to use its IP address as the destination. In most cases, and at least in 99% of all of the things I’ve implemented myself as a developer, you do this using what is known as Unicast:
With Unicast, each device on the network has its own unique IP address that I can use to reach it directly (and yes, I am ignoring here the distinction between local networks and public networks and how they handle it). The key thing here is that an IP address is associated with one device only, so as the illustration above shows, when the red device wants to send a message to the green device, it can send it via Unicast simply by stating the green device’s IP address as the destination.
Anycast is different. With Anycast, multiple devices on the network can have the same IP address associated with them. The end result is more akin to this:
In the illustration above we have 3 different green devices with the same IP address. When the red device wants to send a message to their IP address, it doesn’t really know which one will be receiving its message – just that it is somehow going to be routed to one of them. Which one? The “closest” one usually, whatever that means.
What does that mean exactly?
Here’s how Wikipedia explains it (the illustrations above are rough sketches I did based on the ones I found on their page explaining Anycast):
Anycast is a network addressing and routing methodology in which a single destination IP address is shared by devices (generally servers) in multiple locations. Routers direct packets addressed to this destination to the location nearest the sender, using their normal decision-making algorithms, typically the lowest number of BGP network hops. Anycast routing is widely used by content delivery networks such as web and DNS hosts, to bring their content closer to end users.
Let’s break this down, so we focus on the important bits –
- We get a single IP address that can be shared between multiple devices in different locations
- When we send a message to that IP address, it will get routed to the nearest device
- The decision is made lower in the network layers
- It is popular with CDNs and DNS hosts
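To make the “nearest device” idea a bit more tangible, here is a minimal sketch of that routing decision. It is a toy model, not how real routers are implemented: the site names, the hop counts and the `203.0.113.10` address (taken from a documentation-only range) are all assumptions of mine, and real BGP route selection weighs more than hop count alone.

```python
from dataclasses import dataclass

ANYCAST_IP = "203.0.113.10"  # illustrative address from a documentation-only range


@dataclass(frozen=True)
class Site:
    name: str  # hypothetical site label
    ip: str    # address this site announces
    hops: int  # hypothetical hop count as seen from the sender


def route(destination_ip: str, sites: list[Site]) -> Site:
    """Pick where a packet lands: the fewest hops among sites announcing the IP."""
    candidates = [s for s in sites if s.ip == destination_ip]
    if not candidates:
        raise LookupError(f"no route to {destination_ip}")
    return min(candidates, key=lambda s: s.hops)


# Three devices share one address; the sender never chooses between them.
sites_seen_from_paris = [
    Site("frankfurt", ANYCAST_IP, hops=2),
    Site("virginia", ANYCAST_IP, hops=5),
    Site("singapore", ANYCAST_IP, hops=9),
]

print(route(ANYCAST_IP, sites_seen_from_paris).name)  # prints "frankfurt"
```

The sender only ever names the shared address. Which device answers is decided by the network, which is exactly the part that moves “lower in the network layers”.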
Anycast is something that is being widely used today, just not in VoIP or WebRTC.
The main purpose of Anycast at the end of the day is to provide high availability for stateless services.
- Why high availability? Because we have multiple devices with the same IP address. If one goes down, messages get routed to other servers. Magically.
- For stateless services? Since we don’t know and can’t guarantee which device each message is going to be routed to, it is simpler to use it for stateless services.
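Both points can be shown in a few lines. This is a hedged sketch under simplifying assumptions: the server names, hop counts and the DNS-style record are made up, and “going down” is modeled as simply dropping out of the set of servers announcing the shared address.

```python
def pick_server(hops_by_server: dict[str, int], healthy: set[str]) -> str:
    """Return the nearest server that is still announcing the shared address."""
    reachable = {name: hops for name, hops in hops_by_server.items() if name in healthy}
    if not reachable:
        raise LookupError("no server is announcing the address")
    return min(reachable, key=reachable.get)


def resolve(server: str, hostname: str) -> str:
    """A stateless lookup: the answer does not depend on which server handles it."""
    records = {"example.com": "192.0.2.1"}  # every server holds the same hypothetical data
    return records[hostname]


hops = {"frankfurt": 2, "virginia": 5, "singapore": 9}
healthy = {"frankfurt", "virginia", "singapore"}

first = pick_server(hops, healthy)
healthy.discard("frankfurt")  # simulate the nearest server going down
second = pick_server(hops, healthy)

print(first, second)  # prints "frankfurt virginia"
print(resolve(first, "example.com") == resolve(second, "example.com"))  # prints "True"
```

The second request lands on a different machine and nobody notices, because the request carried everything needed to answer it. That property is what stateful protocols lack.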
Challenges of using Anycast in WebRTC
The best thing you can do with Anycast is to deal with single request-response pairs – stateless.
Why? You send out your request (for example to translate a DNS name to an IP address; or for that next chunk of a Netflix episode you’re watching), and the server (device) you reach on the network sends you that response.
Looking for the next chunk in the Netflix episode or need another DNS name translation? Easy – send another request, and the same or another server with the same Anycast IP address will respond.
Enter WebRTC.
A world where everything and anything is stateful.
There’s signaling, with its connection state machine, ICE negotiation state machine (see? “state machine” hints at this not being stateless) and application logic on top.
Then there are TURN servers and media servers. All of them need to understand the state and manage incoming media flow that is both stateful and real time.
This makes utilizing Anycast in WebRTC quite a challenge.
While we’d like to enjoy Anycast’s obvious advantage of high availability (and a few other advantages it gives), in order to do so, we need to overcome the statefulness challenge first.
The simplest link in WebRTC is the TURN server. While stateful, its job is rather simple – routing data between peers without much thought. This makes TURN servers the best candidate for infrastructure optimizations using Anycast.
Let’s see what advantages Anycast TURN infrastructure can give WebRTC applications.
3 advantages of Anycast for WebRTC
Once you get down to it, deploying TURN servers and maybe even media servers using Anycast can give some interesting benefits to your infrastructure.
Here are the main advantages – ones that are going to define how WebRTC infrastructure will be designed and deployed in the coming years.
#1 – Better geolocation
When a user connects to your WebRTC application, your best bet is to make sure the user is as close to your infrastructure as possible. The faster you get them onto a TURN or media server, the better media quality you can expect.
Why? Simple. Because from the server the user connected to onward, you control and own the media flow. And if you control and own it, you can make it better. But the part of the journey the media makes from the user to your first server? That’s something you don’t control or own, so your ability to improve quality there is lower.
This is why whenever a user joins, you are likely to start doing some geolocation, trying to figure out where the user is coming from in order to allocate them your “closest” TURN or media server.
That process is usually done in one of two ways: by looking at the origin IP address and using a third-party service to look up the location of that IP address, or by DNS geolocation, letting a DNS server do that for us somehow. When we leave it to the DNS, we are at the mercy of the DNS hosting service. It works, but not always. And it is also somewhat slow to update.
Remember that time you changed the DNS configuration of your WordPress server? Were you told it can take a few hours to “propagate”? Well… that’s exactly the problem you might be facing in getting routes updated when using DNS geolocation.
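For reference, the application-level approach usually boils down to something like the sketch below. It assumes a third-party lookup has already turned the user’s IP address into coordinates (that step is not shown), and the TURN sites and their coordinates are hypothetical. Note that it picks the geographically closest site, which is not necessarily the closest one in network terms.

```python
import math

# Hypothetical TURN sites: name -> (latitude, longitude) in degrees
TURN_SITES = {
    "frankfurt": (50.11, 8.68),
    "virginia": (38.95, -77.45),
    "singapore": (1.35, 103.82),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def closest_turn_site(user_location: tuple[float, float]) -> str:
    """Pick the TURN site with the shortest geographic distance to the user."""
    return min(TURN_SITES, key=lambda name: haversine_km(user_location, TURN_SITES[name]))


paris = (48.86, 2.35)  # assumed output of an IP geolocation lookup
print(closest_turn_site(paris))  # prints "frankfurt"
```

Every piece of this (the lookup database, the site list, the distance heuristic) is something the application has to own and keep up to date.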
With Anycast, geolocation takes place at the BGP level. Don’t ask me what that is exactly, but it means two things for us:
- Changes and updates propagate faster. I was told by Subspace that their network fully updates within 30 seconds of a change taking place
- You (or whoever provides your WebRTC servers with Anycast) control and own these routes and their optimization.
That second point is a big difference. DNS servers have a different “job to be done” than WebRTC Anycast services. The latter focuses on real time delivery and on better and more optimized geolocation as an extension of it. So you can expect better results overall, especially on a global scale.
#2 – Higher resiliency (and security)
Operating an Anycast service requires solving the statefulness challenge when it comes to WebRTC. Once that is solved, we gain the benefit of having our data routed through the closest server at the IP layer.
If the physical server we’re working with goes down, then Anycast will reroute future traffic through other servers with the same IP address. And that gives us a natural resiliency.
Furthermore, assume I am an “adversary” that wants to take down your service or disrupt it.
I can check the IP addresses you are using and map your servers. I can then commence with a DDoS attack to flood one or more of your servers via these IP addresses.
If that IP address belongs to a specific server, it will require a relatively small amount of traffic to bring that server down to its knees. But if that IP address belongs to multiple servers via Anycast, then flooding that IP address means trying to flood the whole network and not a specific server – a much harder task to achieve.
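A bit of back-of-the-envelope arithmetic shows why. The numbers below are invented, and the even spread is a simplifying assumption: in practice each attacking machine lands on whichever server is nearest to it, so the split depends on where the attack traffic originates.

```python
def per_server_load_gbps(attack_gbps: float, servers: int) -> float:
    """Attack traffic each server sees if it spreads evenly (a simplifying assumption)."""
    return attack_gbps / servers


SERVER_CAPACITY_GBPS = 10.0  # hypothetical capacity of one server
ATTACK_GBPS = 40.0           # hypothetical attack volume

for servers in (1, 20):
    load = per_server_load_gbps(ATTACK_GBPS, servers)
    status = "overwhelmed" if load > SERVER_CAPACITY_GBPS else "within capacity"
    print(f"{servers} server(s): {load:.1f} Gbps each, {status}")
```

Under these assumptions, the same attack that swamps a single Unicast server four times over leaves each of 20 Anycast servers with plenty of headroom.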
Resiliency comes built-in with Anycast.
#3 – Ease of configuration
The ease of configuration is something you get from the first two advantages.
Once we’re using Anycast, then there are a few things that make our lives easier:
- The whole GeoDNS operation we’re doing is handled at a lower level for us via Anycast, and the higher application layers can remain uninvolved
- If a routing change is needed, the change takes effect a lot faster, giving us a tighter feedback loop on the changes we’re making
- With a single IP address, we have fewer addresses to give customers who need to configure their firewalls accordingly – our list of IPs is simpler and shorter
- Since we are more resilient by design, decommissioning servers, upgrading them, replacing them or whatever is easier to deal with, since existing traffic is less affected
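As an illustration of the shorter list, here is a sketch of the ICE configuration a signaling server might hand to clients. The helper name, the address (from a documentation-only range) and the credentials are placeholders of mine; the `urls`, `username` and `credential` fields are the ones the browser’s `RTCPeerConnection` expects in its `iceServers` list, and 3478 is the default TURN port.

```python
import json

ANYCAST_TURN_IP = "203.0.113.10"  # placeholder address from a documentation-only range


def ice_servers(turn_ips: list[str], username: str, credential: str) -> list[dict]:
    """Build an `iceServers` list with UDP and TCP TURN URLs for each address."""
    return [
        {
            "urls": [
                f"turn:{ip}:3478?transport=udp",
                f"turn:{ip}:3478?transport=tcp",
            ],
            "username": username,
            "credential": credential,
        }
        for ip in turn_ips
    ]


# With Anycast, one address covers every region.
config = {"iceServers": ice_servers([ANYCAST_TURN_IP], "demo-user", "demo-secret")}
print(json.dumps(config, indent=2))
```

The same single address is all a customer’s firewall team needs to allow, regardless of how many servers sit behind it or where they are.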
Is Anycast in the future of WebRTC?
Anycast is where much of the future of WebRTC services lies.
We are shifting our focus to how to optimize and maintain WebRTC infrastructure at scale. Last year it was all about getting to that 49-grid gallery view. This year it is a lot more nuanced. It is mostly about scale, performance and global reach as far as I can tell.
Anycast can play a vital role in that area and in how services can improve their performance and perceived quality for their users.
Great article. Thank you!
One comment: it seems to me that while Anycast has resilience built in, it could be fertile soil for sabotage as well. What if someone sets his own server to the same IP address as another server and hijacks the transmission of a packet, disrupting the flow of communication? Maybe I am being naive here and don't fully understand the mechanism routers use to resolve traffic.
Not my expertise Carlos, but assuming what you say is correct, then DNS and CDN traffic are both prone to the same hijack issues, and I don’t believe that that is the case.
Interesting article. If the connection between two peers is direct, i.e. without an intermediate TURN or media server, does Anycast have any advantage? Perhaps for the initial signalling connection to the signalling server?
Jernej,
The answer I usually get to such a question is "it depends". Sometimes the use of Anycast would be preferable, and it is highly dependent on the networks each of the participants is on versus where the cloud servers are.