What is the Difference Between a Signaling Protocol and a Transport Protocol?

December 4, 2014

Time for me to sort this one out, since I am one of the people who helped spread this misunderstanding.

I adopted a slide in the past and have used it on many occasions to explain the different signaling options available to developers. Here's one of its variants:

The problem, which Justin Uberti once pointed out, is that I made a mistake by lumping signaling and transport protocols together, which I shouldn't have done. And I agree with Justin here. Consider this my attempt at rectifying that mistake. The trigger was a question on Quora I bumped into lately: a person was asking about the difference between XMPP and BOSH.

The easy answer to this question is that XMPP is a signaling protocol and BOSH is a transport protocol. There. We're done.

We have low-level transport protocols already. They are called TCP and UDP. What these protocols do is allow sending arbitrary data from one point in the network to another. Not many assumptions are made about the data being sent, and it is assumed that some application on top will try to make sense of that data. How these protocols work internally is out of scope here.
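To make the "arbitrary data" point concrete, here is a minimal sketch in Python. It uses a local connected socket pair as a stand-in for a TCP connection (an assumption made so the example runs without a network), and the helper name `send_opaque` is mine. The point is only that the transport hands over whatever bytes it is given, without knowing what they mean.

```python
import socket


def send_opaque(payload: bytes) -> bytes:
    """Push bytes through a connected socket pair and return what arrived.

    The socket pair is a local stand-in for a TCP connection. Nothing here
    inspects or interprets the payload. Meant for small payloads only, since
    the sender is not read from concurrently.
    """
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)
        # Signal end-of-stream so the receiver knows when to stop reading.
        sender.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()
```

Text, JSON, or raw binary all go through the same way. Making sense of the bytes is left to whatever sits on top.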

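In our browsers, the transports that let us move arbitrary data between the browser and the web server include XHR, SSE and Websocket. They differ in direction: XHR requests are initiated by the browser, SSE streams from the server to the browser, and Websocket is bidirectional. If what you are trying to achieve is sending arbitrary data, then you'd pick one of these transports, or a combination of them.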

Signaling protocols go one step higher. I have this need. I want to be able to express some mechanism - a way to tell the other end something. In our case it can be the need to open a call, my availability, my identification. To that end, I can either invent a protocol to do that or use a predefined protocol - something that people have already agreed upon in the past. This protocol is a signaling protocol.

The predefined ones? H.323, SIP and XMPP. There are more, but these are the main ones used in VoIP and instant messaging.

The SIP signaling protocol uses TCP or UDP for its transport. For WebRTC, there is an adaptation of SIP over Websocket.

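XMPP uses BOSH most of the time as its transport when used inside web browsers. BOSH works over HTTP long polling and typically keeps up to two HTTP connections to a server: one request that the server holds open until it has something to deliver, and a second that the client can use to send while the first is pending. XMPP can also use Websocket as transport.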

If you don't want to learn, don't care about or have no real need for SIP and XMPP, you can forgo them altogether and invent your own proprietary protocol. You can run it on whatever transport you see fit. It can even be a combination of several transport protocols.
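As a rough illustration of that split, here is a small Python sketch of a made-up proprietary signaling protocol. Everything in it is hypothetical: the message vocabulary (`identify`, `presence`, `call`), the JSON encoding, and the `Signaler` class are assumptions for the sake of the example, not part of any standard. The transport is just a function that accepts bytes, so the same signaling code could run over Websocket, XHR, or anything else.

```python
import json
from typing import Callable

# A "transport" here is any callable that delivers bytes to the other end.
Transport = Callable[[bytes], None]

# Hypothetical message vocabulary for this sketch.
MESSAGE_TYPES = {"identify", "presence", "call"}


def encode(msg_type: str, sender: str, **fields: str) -> bytes:
    """Serialize one signaling message to bytes."""
    if msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {msg_type}")
    message = {"type": msg_type, "from": sender, **fields}
    return json.dumps(message, sort_keys=True).encode("utf-8")


def decode(raw: bytes) -> dict:
    """Parse bytes received from the transport back into a message."""
    return json.loads(raw.decode("utf-8"))


class Signaler:
    """Expresses signaling intent; knows nothing about how bytes travel."""

    def __init__(self, user: str, transport: Transport) -> None:
        self.user = user
        self._send = transport

    def identify(self) -> None:
        self._send(encode("identify", self.user))

    def set_presence(self, status: str) -> None:
        self._send(encode("presence", self.user, status=status))

    def call(self, callee: str) -> None:
        self._send(encode("call", self.user, to=callee))


if __name__ == "__main__":
    wire = []  # in-memory stand-in for a real transport
    alice = Signaler("alice", wire.append)
    alice.set_presence("available")
    alice.call("bob")
    for raw in wire:
        print(decode(raw))
```

Swapping `wire.append` for a function that writes to a Websocket or posts over HTTP changes the transport without touching the signaling logic.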

Why is this important?

In the not so distant past, we were led to believe that there must be a standardized signaling protocol that everyone uses.

This concept has been broken. A standardized signaling protocol still has value for many use cases, but for others it holds none, and in many cases there is simply no business value in adopting one.

There are many business reasons why this came to be. The technical reasons that enabled that?

  1. The adoption of modern transport protocols in web browsers (mainly Websocket) and their wide use on the web
  2. The adoption of a media engine with a standardized API (call it WebRTC)

Together, these two made it easier than ever to just use whatever protocol is necessary, turning communications from a standalone service into a feature of another service that has its own signaling needs.

