Twilio Signal 2020 occurred virtually this year. The number of new announcements or market changing ones was low compared to previous years. I expected more from Twilio as the leading CPaaS vendor.
Twilio Signal is Twilio’s yearly event where its major announcements are made. It is also a gathering place where customers, partners and even Twilio CPaaS competitors come to meet. This year, like all other events, Signal was virtual. Twilio built its own hosting platform and event experience and did a good job at that.
You can find more Twilio coverage on my website.
I’ve watched the keynote twice, and several of the other sessions, including all major announcement sessions. I came out of this feeling a wee bit disappointed. There was nothing really interesting or groundbreaking this year. Especially not if you compare it to some of the previous years:
In 2020, we’ve seen Twilio Microvisor (the Electric Imp acquisition), Frontline, Video WebRTC Go, Event Streams and Verify Push.
The main keynote by Jeff Lawson, Twilio CEO, had 3 components to it, with 3 main messages:
I’ll focus on the big and new parts here.
Twilio is now 12 years old and it has accomplished a lot. Jeff threw the “Twilio is big” numbers too fast for my taste, not even letting some of the big numbers register in our minds properly.
Here are the numbers. I tried aligning them with last year’s numbers from Twilio Signal 2019:
Metric | 2019 | 2020 |
Interactions | 750B | 1T |
Unique phone numbers | 2.8B | 3B |
Calls/minute | 32,500 | - |
Peak SMS/second | 13,000 | - |
Email addresses | 3B/quarter | 50% |
Video minutes | - | 3B |
Customers | 160,000 | 200,000+ |
Developers | 6M | - |
Jeff alluded to the new normal, forced on us due to the pandemic. In many ways, this has been the main theme of Signal and the sessions.
My gripe with the “new normal” moniker for our situation is that there isn’t anything normal about it and it isn’t really here to stay.
Yes. We are seeing an accelerated move towards digital transformation and the cloud, but some of this shift, and especially the high usage in some sectors (such as education) aren’t here to stay post-pandemic.
For me, there’s no “new normal”. Just a transition to one, which will take time. How the future is going to look is hard to say from our current position.
Which leads me to the interview Jeff did with John Donahoe, Nike CEO.
Jeff picked John Donahoe as the first person to interview during the keynote. It is an interesting choice.
I found it a tad ironic to get an explanation about social good and how Nike in all its years promoted social causes. It got me thinking about the Nike sweatshops. Other than this little history reframing that was done, the interview was quite good.
Two sentences that John said really resonated with me:
“Every business in the world is embracing digital transformation. We all have no choice”
The shift towards making businesses more digital has been inevitable.
Just think of all the on premise contact centers and what they now have to do when all of their agents are working from home. Or how all brick and mortar stores need a digital footprint to be able to even stay in business and sell throughout the quarantines.
“There is no finish line”
I should start using it myself.
There are a lot of discussions around build vs buy that I participate in, especially when it comes to the decision to build a WebRTC infrastructure versus buying an existing one via CPaaS vendors. In many cases, the argument and focus are on the initial development effort and a lot less on maintenance. The thing about maintenance is that it is almost as hard as the initial development, especially because there is no finish line - the product team will always ask for more features and capabilities, which will drive more investment.
The first announcement made during the keynote was about a new product - Twilio Microvisor.
The Twilio Microvisor is an extension of the Twilio Super SIM and its Internet of Things initiative, which many don’t even associate with CPaaS (I’ve been ignoring it as well).
The world of IOT and M2M is a challenging one. It includes different networks and carriers, differences in geographies and regulation, different hardware devices and chipsets.
Earlier in the year, Twilio acquired Electric Imp. This acquisition is now the Twilio Microvisor.
Up until now, the only real touching point that Twilio had with the physical world was their Super SIM. With Microvisor (and Electric Imp) that changes, and Twilio is mucking around with microcontrollers, firmware and hardware.
In the special announcements session, Evan Cummack, GM of IoT at Twilio, explained that there was a gap in the market - as a developer you either had to begin from scratch or use readymade solutions:
He ignored a few of the competitors for the Twilio offering, but these are less flexible and open anyways.
What Twilio is doing with Microvisor is taking care of a few important aspects of IOT development:
The secure part here is key, as it is the one thing we struggle with greatly in IOT these days. This solution will remove a lot of the headaches of IOT development and get more products released.
It is also where Twilio is competing not with other CPaaS vendors but rather with cloud vendors, who also started offering IOT tooling in recent years.
Coming from the Video and WebRTC space, this is where I am most frustrated.
With the pandemic going on, Twilio had to do something about video, an area where little investment on their part has taken place. Until 2020, this had been understandable. Growth came from elsewhere and it didn’t seem like video was that important.
All this has changed. Zoom exploded, Agora.io had a great IPO, and Twilio itself saw a 500% increase in daily usage of its video service.
The one to talk about Twilio Programmable Video was Michelle Grover, Chief Information Officer. Her part of the keynote revolved around the market need. The main market verticals here were retail and health.
It was more a reminder that Twilio is doing video than anything else.
The new announcement? Twilio Video WebRTC Go
What is Twilio Video WebRTC Go?
For context, pricing of 25 GB/month on Twilio’s TURN servers in the US is $10/month.
If you developed your own signaling and your own application, relying on Twilio’s TURN servers, then switching to Twilio Video WebRTC Go will save you $10.
This whole "offer" got me into the rabbit-hole of free WebRTC minutes.
But what you really get here is Twilio Video P2P that costs $0.0015/minute. In this configuration, you get the full infrastructure and support of Twilio’s signaling, logging and SDKs practically for free if your service is smaller than 25 GB/month of TURN media relay. How many video sessions can this accommodate? That’s something you’ll need to calculate.
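To get a feel for that calculation, here is a rough back-of-the-envelope sketch. It is not Twilio's billing formula. The bitrate per direction, the assumption that the quota counts both directions of a relayed 1:1 call, and the share of calls that end up needing TURN are all illustrative guesses you should replace with your own measurements. The function names are made up for this example.

```python
def relayed_minutes(quota_gb, kbps_per_direction, directions=2):
    """Minutes of relayed 1:1 media that fit inside a TURN quota.

    Assumptions (not from Twilio): decimal gigabytes, a constant bitrate,
    and a quota that counts every relayed direction of the call.
    """
    quota_bits = quota_gb * 1e9 * 8  # GB -> bits
    bits_per_minute = kbps_per_direction * 1000 * directions * 60
    return quota_bits / bits_per_minute


def total_call_minutes(quota_gb, kbps_per_direction, turn_share, directions=2):
    """Total call minutes supported if only `turn_share` of them use TURN.

    `turn_share` is a hypothetical fraction between 0 and 1; the rest of the
    calls are assumed to connect directly and consume none of the quota.
    """
    if not 0 < turn_share <= 1:
        raise ValueError("turn_share must be in the range (0, 1]")
    return relayed_minutes(quota_gb, kbps_per_direction, directions) / turn_share


if __name__ == "__main__":
    # 25 GB quota, 1 Mbps in each direction, 20% of calls relayed (all assumed)
    print(round(relayed_minutes(25, 1000)))
    print(round(total_call_minutes(25, 1000, 0.2)))
```

With these made-up inputs, 25 GB covers roughly 1,667 relayed minutes, or around 8,333 total call minutes if one in five calls needs a relay. Lower bitrates or a smaller TURN share stretch the quota further.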
For Twilio this is a win, as it gets more companies to adopt its Programmable Video at a very low price to Twilio (remember - video isn’t a serious money maker for Twilio yet, so helping these smaller users to grow their business and then have them start paying is just fine). With all the video API services out there, a free offering from a large vendor is a first. While limited, it is probably useful for many companies starting their way with 1:1 video calling.
The fact that Twilio is calling their reference apps “Open Source Video Collaboration Apps” is a bit silly. These are references/samples running on top of the Twilio Programmable Video API and are not meant, designed or easily usable on top of any other vendor or on top of any other infrastructure.
Calling a piece of code, no matter how big, open source, while forcing its user to consume other paid services in order to use it is not exactly open source.
This isn’t to say that this open source reference app isn’t useful. It surely is most useful. It gives developers a better starting point for their application, and Twilio has taken the time at Signal to offer a session titled “Accelerating Development of Collaboration Apps with Twilio Video” dedicated exactly to this.
It is a trend I see of CPaaS vendors going towards higher level abstractions. Twilio is doing that with nocode (=Twilio Studio), programmable enterprise (=Twilio Flex), reference apps for video (this one) and now with Frontline (later in this article).
For me this says that Twilio hasn't invested in video as much in the last year or two. If they had, they would have announced something more thrilling and interesting. Maybe larger meetings, above 50 participants? Broadcasting capabilities? Noise suppression? Something...
The keynote and the session had a lot of Twilio Flex content in them. This is less about developers and more about contact centers.
In this event, Tony Lama, Vice President, Contact Center Sales at Twilio mentioned in brief the fact that many features were added to Flex, but didn’t really delve into them too much. The focus was on the fact that Flex has customers and now has a thriving ecosystem of partners as well.
The main target for this year was the on premise contact centers - this is where Twilio is setting its sights - in the transformation these contact centers are going through as they are heading to the cloud (forced to do so earlier rather than later due to the pandemic).
This is why Twilio decided to focus on the ecosystem, making it into a big announcement:
This targets exactly the on premise contact centers, where large deployments with many agents and a lot of custom integration code and features were added over the years. An ecosystem around Flex gives Twilio the reach it needs.
It is also why Twilio introduced its latest Flex partner - Deloitte Digital - who offer system integration in this target market.
Twilio Flex and its current set of announcements is less about CPaaS and developers and more about contact center as a service (CCaaS).
In that vein, the announcement of Twilio Frontline was made.
Interestingly, this was introduced by Simon Khalaf, SVP and GM, Messaging at Twilio.
Twilio Frontline is a new complete, closed, mobile application and service which enables employees in a company to directly communicate with customers through messaging channels.
The main benefits touted about Frontline? SSO (Single Sign-on) and CRM integration
This is far removed from the developer roots and target audience of Twilio, so it will be interesting to see how this plays out and redefines Twilio itself. My guess is that Frontline started as a skunk works project during the pandemic, one that turned into a new product that is now looking for a home at Twilio and within its bigger storyline.
I wonder though, was this built on top of Twilio Conversations, which was introduced at Signal 2019, or is it something implemented on top of Twilio Flex?
If this was implemented on top of Twilio Flex (which I believe it was), then why is the SVP and GM of Messaging at Twilio the one introducing it? And why wasn’t it designed, developed and even introduced as a programmable solution? Part of Flex. Maybe even an “open source application” on top of Flex.
Frontline is an interesting product. But what does it have to do with Twilio?
There was little in the keynote of Twilio about APIs and CPaaS and more about the higher level abstractions and complete applications (Flex and Frontline). This shows a maturity level at Twilio, where most of the CPaaS domains are already well covered by their APIs.
Two additional announcements of new features/products were made, though not in the keynote itself.
That trillion human interactions? These are probably just events in the Twilio system:
This is the slide shared in the session discussing the new feature/product of Twilio Event Streams. It isn’t a trillion but it is close enough.
What Twilio did was consolidate all of its events into a single hook, calling it Event Streams, offering a single integration point for collection of events. The first sink selected for these events is Amazon Kinesis, with more to probably be added later, based on customer demand.
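On the consuming side, a single sink means a single piece of code to unpack events. Below is a minimal sketch of that idea, assuming the events are read by an AWS Lambda function attached to the Kinesis stream, where each record's payload arrives base64 encoded. The assumption that each payload is a JSON object (or a JSON array of objects) with a `type` field is mine, and the event type in the usage example is only illustrative; check the Event Streams documentation for the actual schema.

```python
import base64
import json
from collections import Counter


def decode_kinesis_records(lambda_event):
    """Turn a Lambda Kinesis event into a flat list of decoded JSON events.

    Assumption: every record holds either one JSON object or a JSON array
    of objects. Both shapes are handled to keep the sketch forgiving.
    """
    events = []
    for record in lambda_event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        payload = json.loads(raw)
        if isinstance(payload, list):
            events.extend(payload)
        else:
            events.append(payload)
    return events


def count_by_type(events):
    """Count events per `type` field (hypothetical field name)."""
    return Counter(event.get("type", "unknown") for event in events)


if __name__ == "__main__":
    # Synthetic record for illustration only, not captured from Twilio
    body = json.dumps({"type": "com.twilio.messaging.message.delivered", "data": {}})
    sample = {"Records": [{"kinesis": {"data": base64.b64encode(body.encode()).decode()}}]}
    print(count_by_type(decode_kinesis_records(sample)))
```

Whatever the product that produced an event, it flows through the same decoder, which is the practical appeal of consolidating everything into one stream.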
Moving towards consolidated data management shows maturity and an increase in the customers that are using multiple Twilio products.
Another new product/feature is Twilio Verify Push. This enables a mobile application to be used as a trusted device/app to validate login on another device (as well as on the device itself). The end result is reduction in the SMS volume.
While nice, I am waiting here for Google and Apple to close this gap and offer their own verification mechanisms to all instead of having application developers rely on third party services.
As for Twilio, this makes for a sensible and useful addition to their Twilio Verify service.
What was missing at Twilio Signal 2020 was AI and machine learning.
No really interesting improvements shared about Twilio Autopilot. No cool introduction of noise suppression or other media processing machine learning capability. Nothing.
There were a few mentions of how Autopilot is used by customers during the pandemic to create bots in order to deflect calls and handle the volume (nice stories that we’ve heard would be the main use case for Autopilot already).
The only “real” thing around AI? At the end of the keynote, Jeff Lawson had his short “live” coding session.
This time, he went for using OpenAI’s GPT-3, a pre-trained natural language processing engine. He made it understand TwiML constructs (the XML format used by Twilio sometimes) so that users can write a sentence of what they want, and the service would generate the TwiML for them. A nice toy to play with. I wonder what people would do from here with it, as it opens up a lot of questions, thoughts and ideas.
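For readers who haven’t seen TwiML, here is a small sketch of the kind of document such a generator would need to produce for a request like “thank the caller, then dial this number”. This is not the demo from the keynote and involves no GPT-3 at all; it simply builds the XML by hand with Python’s standard library. `Response`, `Say` and `Dial` are TwiML elements, while the `build_twiml` helper and the phone number are made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_twiml(steps):
    """Build a TwiML string from (verb, text) pairs, in order.

    Hypothetical helper: it does no validation of verbs or nesting,
    it only wraps each step in an element under <Response>.
    """
    response = ET.Element("Response")
    for verb, text in steps:
        child = ET.SubElement(response, verb)
        child.text = text
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_twiml([("Say", "Thanks for calling."), ("Dial", "+15555550100")]))
```

The interesting part of the keynote demo is the step this sketch skips: getting from a free-form sentence to the list of verbs.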
Machine learning is one of the main pillars I see in post-pandemic CPaaS offerings. Twilio has the skill set inhouse to pull this off, but they need to focus there more than they are doing today. They should probably also partner or acquire in this space to keep in pace with where the industry is headed.
The enterprise story of Twilio came at the beginning of the keynote. Jeff wanted to make sure everyone knew and understood that Twilio is ready for the enterprise and being used by the enterprise. The careful selection of guests throughout the keynote showed that as well - they were all established enterprises. No cool startup this time. No crazy garage developers. Just formidable businesses that existed for years.
I decided to leave this to the end since this is where Twilio is being challenged.
The challenge comes in the form of Amazon and Microsoft going towards CPaaS. Both of these vendors are:
Amazon will probably introduce machine learning capabilities such as noise suppression as part of its CPaaS offering soon. They have it available in Amazon Chime, so placing it in the Chime SDK is the next logical step.
Microsoft runs their CPaaS on the same infrastructure that Teams is running on. Twilio touts 3B video minutes a year while Microsoft Teams has up to 5B meeting minutes a day. On a peak day, that is more meeting minutes than Twilio’s video minutes for the entire year.
Both Amazon and Microsoft have a way to go in stabilizing their APIs and attracting developers and attention to them. They might not be as interested in this CPaaS business as Twilio is, so they would probably never reach the same level of maturity, breadth of features and flexibility as Twilio. But they will surely win market share. Market share that could have easily been Twilio’s.
What is also very interesting to note is that while Amazon and Microsoft made a point of not mentioning WebRTC in the front of their CPaaS platforms (both of which are video first and use WebRTC), Twilio decided to bring WebRTC to the front with their new offering of Twilio Video WebRTC Go. I wonder which works better for enterprise sales.
Anyway, with 75% of contact centers still on premise, the enterprise market as a whole is still only starting its path towards digital transformation and with the new phrase I just adopted of “there is no finish line”, there is definitely room for growth for Twilio and its many competitors.
Interesting times ahead of our industry.