Comments on: Matrix.org and WebRTC: An Interview with Matthew Hodgson

By: Tsahi Levent-Levi

Tsahi Levent-Levi — Wed, 12 Jun 2019 14:52:18 +0000

In reply to Lennie. It might. I stopped doing these interviews though. I'll probably be incorporating a recorded video interview in my WebRTC course at some point for Matrix.

By: Lennie

Lennie — Wed, 12 Jun 2019 09:38:35 +0000

The matrix.org specification has reached 1.0 and they say it’s production ready.

Could this be a good moment for a new interview?

By: Matthew Hodgson

Matthew Hodgson — Tue, 18 Nov 2014 04:44:18 +0000

In reply to Matthew Hodgson.

Randomly, it doesn’t look like I can reply more than 5 deep, so continuing the thread here…

> Tsahi will regret telling us we can take over his comments. 🙂

I am sure he’s loving the debate 😀

> * So Hazelcast – while actually performing surprisingly well in Openfire – really isn’t suited to use over links of that quality. I’ve seen FMUC running over poor quality links, though, and I believe Isode have run clustering over the same.

Right, interesting. When I came across FMUC a few months ago I got the impression it didn’t exist much outside Isode’s commercial implementation. I maintain that normal XEP-0045 MUCs are still problematic though – hence the need for FMUC (or Matrix).

> * Yes, you do indeed want data stored securely, and in a service of your choice. I see this as a strong argument *for* XMPP, though.

Well, XMPP and Matrix share the same architecture here: you know precisely where your data is in a client/server model – there’s no magical DHT or similar p2p cloud.

> * I agree with your sentiment, but I suspect that after everything else has been considered, the fact it’s XMPP over WebSocket rather than HTTP isn’t a big deal. There was a discussion… erm… late last year, I think, about doing XMPP on EPROM-grade embedded kit. It surprised me that it’s possible.

I’m sure it’s possible, but is it actually compelling relative to CoAP or MQTT?

> A chunk of the IoT crowd seems to think XMPP is the right choice there, too.

I can see XMPP being good for the federated messaging ‘backbone’ (and similarly Matrix), but I’m not sure I’d want some super-low-power, super-cheap, low-bandwidth device having to speak XMPP (or HTTP for that matter…)

> * E2E is trivial, these days, in terms of the crypto architecture.

In terms of the crypto theory, yes. Implementations that don’t scare off client developers seem a bit rarer 🙂

> The problem is that if I want to encrypt my traffic to you, I’ve got to find your key, and in general terms that means I need to find something I trust that can tell me your key. When you consider that PH-B’s “encode the key fingerprint into the email address” is currently one of the more user-friendly mechanisms we have on the table, it begins to become more understandable why this is such a tough area to crack.

Hm, sorry for ignorance, but what’s a PH-B (other than a pointy haired boss)?

> An option might be to have an issuer lookup running over a DANE-like DNSSEC mechanism, but if you’re using a centralized flat ID space I’m not sure how that works.

Well, our ID space isn’t flat – Matrix IDs are per-domain, and we could absolutely do a DANE-style model. But instead we’re looking more at going down a POSH-style route, using well-known URIs to let domains host their own keyservers, falling back to a central point of control if all else fails. In fact we’ve been looking at how Open Peer handles the same problem and trying to converge there.

> * So you’re assuming that there is a single point of trust which presumably you operate. And if those ID servers are compromised, the entire network is compromised, isn’t it? I think you’re braver than I am taking on that responsibility. It solves lots of problems, like key-lookup, etc – but everyone on the entire network must trust you entirely

Right now we have a single logical cluster of identity servers (there are two currently; one run by matrix.org and another by the developer who wrote it), and they store all ID mappings and replicate globally across the cluster. Right now the public keys (certs) themselves are actually served up by their own server – like SSH, the first time you connect you trust the server and cache the key to spot future changes.
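The SSH-style "trust on first connect, cache the key, spot future changes" behaviour described above can be illustrated with a minimal sketch. This is not Matrix code: `KeyPinStore`, `KeyChangedError` and the in-memory dictionary are hypothetical names chosen for illustration, and a real implementation would persist the cache and verify actual certificates rather than raw bytes.

```python
import hashlib


class KeyChangedError(Exception):
    """Raised when a server presents a key that differs from the pinned one."""


class KeyPinStore:
    """Hypothetical trust-on-first-use cache: server name -> key fingerprint."""

    def __init__(self):
        self._pins = {}

    @staticmethod
    def fingerprint(public_key: bytes) -> str:
        # SHA-256 of the key bytes stands in for a real certificate fingerprint.
        return hashlib.sha256(public_key).hexdigest()

    def check(self, server_name: str, public_key: bytes) -> bool:
        """Return True if the key was newly pinned, False if it matched the pin."""
        seen = self.fingerprint(public_key)
        pinned = self._pins.get(server_name)
        if pinned is None:
            # First contact: trust the key and remember it.
            self._pins[server_name] = seen
            return True
        if pinned != seen:
            # The key changed since first contact: refuse rather than re-pin.
            raise KeyChangedError(f"key for {server_name} changed")
        return False
```

The weak point of this model is the first connection, which is why the replies around it discuss keyservers, POSH and DANE as ways of anchoring that initial trust.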

However, the beauty here is that the key distribution and discovery mechanism is *entirely* decoupled from the rest of Matrix – and so especially at this early stage we’re quite happy to add new key distribution mechanisms and deprecate bad ones. Hence looking at how Open Peer does it – and now POSH and DANE too. Personally I particularly dislike the current single logical ID server model, given almost everything else in Matrix is decentralised (apart from the server where your account lives, but we’re looking at fixing that too).

Either way, there is definitely room for refinement in the identity system, but at least the current initial version works. Watch this space to see how it evolves!

By: Dave Cridland

Dave Cridland — Mon, 17 Nov 2014 11:11:56 +0000

In reply to Matthew Hodgson.

Tsahi will regret telling us we can take over his comments. 🙂

* So Hazelcast – while actually performing surprisingly well in Openfire – really isn’t suited to use over links of that quality. I’ve seen FMUC running over poor quality links, though, and I believe Isode have run clustering over the same. So yeah, it’s possible, and works. XMPP servers can cluster internally using eventual consistency (potentially for everything, given enough care), leaving the more structured messaging purely for external connections.

* Yes, you do indeed want data stored securely, and in a service of your choice. I see this as a strong argument *for* XMPP, though.

* I agree with your sentiment, but I suspect that after everything else has been considered, the fact it’s XMPP over WebSocket rather than HTTP isn’t a big deal. There was a discussion… erm… late last year, I think, about doing XMPP on EPROM-grade embedded kit. It surprised me that it’s possible. A chunk of the IoT crowd seems to think XMPP is the right choice there, too.

* E2E is trivial, these days, in terms of the crypto architecture. The problem is that if I want to encrypt my traffic to you, I’ve got to find your key, and in general terms that means I need to find something I trust that can tell me your key. When you consider that PH-B’s “encode the key fingerprint into the email address” is currently one of the more user-friendly mechanisms we have on the table, it begins to become more understandable why this is such a tough area to crack. An option might be to have an issuer lookup running over a DANE-like DNSSEC mechanism, but if you’re using a centralized flat ID space I’m not sure how that works.

* So you’re assuming that there is a single point of trust which presumably you operate. And if those ID servers are compromised, the entire network is compromised, isn’t it? I think you’re braver than I am taking on that responsibility. It solves lots of problems, like key-lookup, etc – but everyone on the entire network must trust you entirely.

By: Matthew Hodgson

Matthew Hodgson — Sun, 16 Nov 2014 22:45:30 +0000

In reply to Dave Cridland.

Thanks for continuing the discussion and the constructive responses! In the spirit of taking over Tsahi’s comment section then (as sad as it is to throw info into a distinctly non-federated silo like a blog…):

* In terms of whether MUCs can federate – I’m aware of Kevin Smith’s XEP-0289, however I don’t think this is compatible with ‘normal’ MUCs, and it also isn’t mainstream yet. Ironically FMUC looks fairly similar to Matrix architecture-wise, although we only discovered it after publishing Matrix. As far as I understood, normal MUCs don’t federate at all – unless the implementation happens to try to support clustering of a single logical MUC over multiple physical servers, like Openfire’s Hazelcast-based plugin. I may be missing something, though, as it seems unlikely anyone would ever run a Hazelcast cluster split over airplane wifi… 🙂

* In terms of caring about what happens to data serverside… well, as a user I want to make sure I have synchronised message history guaranteed on all devices, and it’s even nicer if my data is represented in some kind of redundant manner serverside so that I may recover it from the wider network in case of disaster. Plus I want to be able to pick what service I use to store it.

* In terms of “why not just run an XMPP/SIP/whatever stack clientside in the browser”… well, it’s just engineering hygiene. I don’t want to have to select/write/maintain/expend-CPU/bandwidth/memory on unnecessary dependencies if I can get away with just firing off an HTTP (or CoAP or MQTT) request. This is particularly true on constrained devices.

* Totally agreed that modular specs are a Good Thing – and that Matrix is just a module on top of the web. It’s just a question of how kitchen-sink to go. Matrix is unashamedly quite kitchen-sink, in terms of trying to provide a baseline of common functionality over as many current VoIP/IM apps as possible. But it’s also extensible beyond that. Obviously we’re interested to see whether we’ve set the baseline at the right height 🙂

* End-to-end crypto is indeed hard, but it’s looking pretty positive – we already have PKI crypto-signing running (and mandatory) for federation traffic at the application level, in addition to TLS at the transport layer. Extending this to support actual end-to-end crypto should be fairly straightforward, and should land in the next month or so.

* I’ve personally never developed against Jingle – and I didn’t realise Hangouts still used it. I maintain that it’s not taken over the world for open federated VoIP though, alas…

* Using 3PIDs (3rd party IDs) for ID is how Matrix works, and hopefully doesn’t require infinite trust – instead we currently ringfence the ID servers that map 3PIDs to IDs to a trusted clique (a bit like DNS root servers). This may well evolve in future, but as long as the messaging fabric is decoupled from the identity problem, I hope we can avoid infinite trust 🙂
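As a rough illustration of the last point, the mapping from third-party identifiers to internal IDs can be pictured as a small directory. Everything here is an assumption made for the sketch: `IdentityDirectory`, the `(medium, address)` key and the example user ID are illustrative, and nothing below reflects how the actual identity servers store or replicate their mappings.

```python
from typing import Optional


class IdentityDirectory:
    """Hypothetical 3PID directory: (medium, address) -> internal user ID."""

    def __init__(self):
        self._bindings = {}

    @staticmethod
    def _key(medium: str, address: str) -> tuple:
        # Normalise so "Alice@Example.org" and "alice@example.org" match.
        return (medium.lower(), address.strip().lower())

    def bind(self, medium: str, address: str, user_id: str) -> None:
        """Associate a third-party identifier with an internal user ID."""
        self._bindings[self._key(medium, address)] = user_id

    def lookup(self, medium: str, address: str) -> Optional[str]:
        """Return the bound user ID, or None if the identifier is unknown."""
        return self._bindings.get(self._key(medium, address))
```

Because the directory only answers "which ID does this address map to?", it can in principle be swapped out or replicated without touching the messaging layer, which is the decoupling the bullet relies on.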

Finally, with respect to message-passing vs. replicated data structures… yes, they are two sides of the same coin. Matrix isn’t a DHT (unlike Telehash, Tox & friends), but in the end it’s passing messages to synchronise data structures rather than simpler stanza-like events. Whether this is a good idea or not remains to be seen. The assumption is that only server implementors will ever speak the federation protocol, so it doesn’t matter too much that it’s a bit complicated.

In terms of Wave… yes, featurewise Matrix is quite heavily inspired by Wave (albeit without the XMPP, obviously). I don’t think anyone’s actually compared the protocols though – would certainly be good to look and learn.

By: Tsahi Levent-Levi

Tsahi Levent-Levi — Sun, 16 Nov 2014 12:35:15 +0000

In reply to Dave Cridland.

By all means guys – take over my comments section 🙂

If either one of you wants to write a guest post, I’ll be happy to share it here on my blog as well.

By: Dave Cridland

Dave Cridland — Sun, 16 Nov 2014 10:40:56 +0000

In reply to Matthew Hodgson.

Thanks for your response; the majority of nay-sayers tend to limit the descriptions of their problems to vague issues without much definition, but you’ve given some well-reasoned answers.

In order:

* If a MUC is hosted on a single domain, and a single server, then even a single network link dropping can cause an outage. Luckily, a domain can be hosted across multiple servers, and chatrooms can even be federated. I’ve seen this work over satellite links to aircraft – it really does work. I think the bigger problem is that the concepts of group conversation in XMPP are intrinsically linked to a chatroom, meaning that the mechanical semantics have to be mapped very carefully to the user experience – that is, some chatrooms look like chatrooms in the UI, whereas others look like an ad-hoc group conversation.

* I personally don’t think it matters what happens to the data on the server side; and I personally don’t think it matters much if you’re running a different stack inside the browser either, these days. The latter in particular is a matter of architectural choice, however – I suspect that as the XMPP community gains some more familiarity with HTTP/2, we’ll probably see an XMPP-to-HTTP/2 mapping happening.

* Matrix is entirely an optional afterthought to the web, of course. Optional afterthoughts are generally speaking good protocol design, especially if you want to have at least some possibility of people getting into the playing field. There’s really only 5 players in the browser business because the browser requirements are so “kitchen sink”, and this in turn is out of necessity because it’s a single platform standard. The XMPP community had two attempts at a conversation archive spec, XEP-0136 and MAM. Had these been baked into the core spec, I think we’d have been in a much worse shape. In years past, we’ve put together kitchen-sink specs for IM clients and servers, though – I’d personally love to see these resurrected so we can state the requirements for modern IM better.

* Yes. End-to-end crypto is hard. I don’t believe it’s ever been accomplished in a standards-based protocol. XMPP is trying, in a number of directions.

* Jingle is, fundamentally, what Google Hangouts uses. So I’d argue that it’s seen huge uptake. It suffers a bit from interop problems, however I’m hopeful that WebRTC will help with this (as a common set of codecs, better ICE support, etc).

* Yeah… I think this is a tough one. I think using third-party identifiers is great for getting users on-board, since users really hate sign-up. I think using them for auth is probably practical. I think making these homogeneous throughout the network is possibly practical, but not really without having infinite trust within the network.

* Oh dear lord. Yeah, syntax-fashion has been a bit of a pest. Lloyd tried to address this with XMPP-FTW, Lance with stanza.io, but it’s a point of frustration. From a protocol design angle, XML is much more convenient than JSON, but JSON has a very simple data model that maps really well to most OO-ish languages.

On the subject of message-passing versus DHTs, I’d casually point out that you can implement one in the other, but the end result differs in really fascinating ways, in particular, when you consider the privacy of endpoints. But discussing this would require a lengthy blog-post in itself, and we’re taking over Tsahi’s comments quite enough as it is. 🙂

That said, have you looked at Google Wave? I suspect that you’ve some similarity there (and Google Wave based all its federation and message passing on XMPP, despite being more like a distributed database).

By: Matthew Hodgson

Matthew Hodgson — Fri, 14 Nov 2014 23:37:25 +0000

In reply to Aswath Rao.

I guess this one’s for Tsahi, but I’d certainly love to see a follow-up on federation signalling from Tsahi’s perspective.

From my perspective: I agree that federation is a controversial topic. Some obvious arguments against it which I’m aware of include:
* Federation only gives a lowest-common-denominator experience between different apps – because some users may only support a subset of functionality, the overall usability of the app can feel degraded.
* Federation can make it hard to enhance the federation API without fragmentation – any feature beyond the baseline API will by definition be fragmented.
* Given launching a new WebRTC app is as simple as clicking on a link, why bother trying to minimise the number of apps you use?
* Picking a communication channel to use to interact with someone is a social negotiation and is a feature rather than a bug. When you go for dinner with someone you don’t insist on both using your favourite restaurants and then doing a lowest-common-denominator video call between them – instead you agree on a mutually favourite restaurant and use that.
* Why would existing messaging apps ever federate? WhatsApp shows you don’t need to…

However, I genuinely believe that the arguments in favour outweigh the above:
* Users should absolutely have the right to choose what comms solution they use, rather than being forced by their contacts into installing and trusting given apps or using given devices.
* Lowest-common-denominator communication is better than being forced to install and use apps that you don’t want to use, and remember which contact is on which app.
* Discovering how to contact new users is a nightmare currently – you simply don’t know what apps they have installed or what identifiers they use, or which they prefer to use.
* If users want richer domain-specific features only supported by a given app, /then/ that’s a good reason to install the target app.
* My conversation history should not be fragmented across multiple silos – as a user, I should have the right to own my data and control which service hosts it, and be able to index/archive/delete/encrypt it as I desire.
* I can still choose to use different apps by default for different experiences and social groups if I still desire – federation simply provides choice.
* Users have been lulled into thinking that the fragmentation in VoIP/IM is a feature. If email was fragmented like this, there would be rioting in the streets. Imagine if you had to install 20 televisions in your living room, each to watch different shows, each with a different remote control and different size of screen/picture quality/etc? This is where we’re at with VoIP/messaging apps currently…
* Whilst the big existing messaging apps have no particular economic reason to federate (and may even have short-term reasons not to), we think that emerging & smaller apps benefit enormously from federation. And if you glue all of those communities together, you can rapidly produce a community which rivals the existing incumbents *and* supports user choice and freedom. And this is compelling.
* The web, email and the PSTN all show that a federated network can be very compelling. Just because XMPP or SIP haven’t taken over the world doesn’t mean that the same doesn’t apply for VoIP/IM.

I could go on 😉

By: Matthew Hodgson

Matthew Hodgson — Fri, 14 Nov 2014 21:47:04 +0000

In reply to Philipp Hancke.

Good questions 🙂 Sorry for delay in response – have been travelling back from TADSummit and wanted to try to write a comprehensive answer. The main limitations of XMPP that pushed us towards Matrix were:

* We wanted an architecture where group communication has no single point of control or failure. XMPP MUCs depend on a single chat server, with no intrinsic support for horizontal scalability (especially over federation). A single server crash or disconnect should never be sufficient to destroy a room – and rooms should be able to recover gracefully after netsplits or partial data loss.

* Running a clientside XMPP stack feels unnecessary in a browser which already has an HTTP stack, especially given how performant SPDY and HTTP/2 are. Whilst XMPP-FTW provides a REST mapping for XMPP, you still end up converting everything into XMPP in the end.

* The baseline featureset of XMPP is deliberately too minimal for many use cases (e.g. no group chat; no history; no reliable message delivery; no specific support for mobile use cases…) – and whilst there are obviously many XEPs, this introduces fragmentation as you can’t guarantee that any given client or server will actually support the extensions and provide a good federation experience. For instance, we believe message history should be a fundamental feature for any modern communication app – whereas it’s very much an optional afterthought on XMPP.

* We wanted to bake in cryptographically strong identity for both servers & clients from the outset, to avoid identity and history spoofing, support end-to-end crypto and provide good foundations for avoiding spam.

* Jingle feels unnecessarily complicated for the task of setting up WebRTC calls, and has not seen huge uptake.

* JIDs haven’t taken over the world; we wanted an architecture which primarily used 3rd party IDs (email addresses, MSISDNs, Facebook IDs etc) which get mapped through to relatively opaque internal identifiers… rather than relying on the success and ubiquity of a global identifier namespace like JIDs or SIP URIs.

* JSON is perhaps a more convenient data representation for current web developers than XML.

In the end, XMPP and Matrix federation have fundamentally different architectures despite similar use cases and superficial similarities. XMPP is all about passing stanzas around, whereas Matrix is all about replicating crypto-signed data structures (conversation history message graphs) with eventual consistency. If anything, Matrix has more in common with an eventually-consistent globally distributed database than a message-passing system like XMPP.

This does have the disadvantage that implementing federation in Matrix is certainly more complicated than XMPP. But we hope that the features are worth it 🙂
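The replicated-history model described above can be sketched in a few lines. This is a toy under stated assumptions, not the Matrix federation protocol: events are identified by a hash of their content and parents, signing is omitted entirely, and `make_event`, `merge` and `heads` are hypothetical helper names. It only shows why two servers that diverge during a netsplit can converge again by exchanging events.

```python
import hashlib
import json


def make_event(body: str, parents: list) -> dict:
    """Build an event whose ID is a hash of its body and its parent IDs."""
    payload = json.dumps({"body": body, "parents": sorted(parents)}, sort_keys=True)
    event_id = hashlib.sha256(payload.encode()).hexdigest()
    return {"id": event_id, "body": body, "parents": sorted(parents)}


def merge(replica_a: dict, replica_b: dict) -> dict:
    """Union of two replicas (event ID -> event).

    IDs are content hashes, so equal IDs imply equal events and the union
    is the same whichever side merges first.
    """
    merged = dict(replica_a)
    merged.update(replica_b)
    return merged


def heads(replica: dict) -> set:
    """IDs of events that no other event names as a parent (the graph tips)."""
    referenced = {p for ev in replica.values() for p in ev["parents"]}
    return set(replica) - referenced
```

After a split, each side has its own head; once merged, a new event that lists both heads as parents joins the branches back into a single history. A stanza-passing design has no equivalent of this graph to reconcile, which is where the extra implementation complexity comes from.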

In terms of TLS fun and games… the Matrix client-server API expects to be proxied behind whatever existing HTTPS (or HTTP, if you’re desperate) vhost you have hanging around, so we dodge the bullet there. We need to support WSGI or similar; for now the server exposes a standalone Twisted listener.

The server-server API is a bit trickier; because we mandate HTTPS for federation, the server by default spins up its own TLS listener on port 8448, which we expect folks to unfirewall and advertise via SRV. By default the server generates a self-signed certificate and publishes the public cert over HTTPS (in future via a public key server), thus avoiding the need for everyone to go and buy publicly signed PKIX X.509 certs. Servers connecting over federation validate the public certs against the key server (or their cache of the cert). This listener can also be proxied behind an existing HTTPS vhost with an officially signed X.509 cert if needed (or a self-signed cert that has been published to Matrix). We authenticate the requests at the HTTP layer (TLS client certs being unhelpful if the TLS listener is loadbalanced).

In terms of DANE: we’d be more than happy to authenticate certs via DNSSEC rather than CA signatures, or our own “we just publish the public certs out of band” approach. We haven’t implemented it yet – patches welcome; it’d just be an extension of ways we specify to validate a cert.

In terms of POSH: honestly, I think we missed this one. It looks like a more robust version of the publish-your-cert-over-HTTPS that Matrix does already today. The fact that by default we generate per-server self-signed certs for federation means we limit the exposure of private keys to just the Matrix service in a multitenant environment, but the idea of delegating your certs to a better trusted intermediary seems smart. We’ll have to look into it properly. On the plus side, the Matrix spec (and especially the identity & crypto system) isn’t frozen yet, so we’re always very happy to take inspiration from work like this. Thanks for the pointer!

By: Aswath Rao

Aswath Rao — Thu, 13 Nov 2014 23:37:07 +0000

Would you be doing a follow-up post on what points of yours on fed sig were answered and what issues are still outstanding?