Satya Nadella is a fan of multimodal AI interfaces — the ability to interact with a chatbot not just through text but through voice, for example — so much so that it has completely changed the way he “listens” to podcasts.
Speaking recently on the Minus One podcast from South Park Commons, the Microsoft CEO said he has set the Action Button on his iPhone to activate Microsoft Copilot's voice mode via Apple CarPlay. That allows him to easily engage with Microsoft's AI in the car, including as an alternative way of consuming podcasts.
“The best way for me to consume podcasts is not to actually go listen to it but to have a conversation with the transcript on my commute using my Copilot. Who’d have thought?” he said.
“But it is more convenient because of the modality, the fact that I can speak to it, I can interrupt it,” he said. “Think about it, right? This full-duplex conversation which was never possible — that is a fantastic new modality. … There’s no going back.”
This resonates with me. I’ve done this not just with videos and podcasts but also with entire books (as a way of refreshing my memory before an interview with an author, for example).
Having a conversation with a transcript like this is doable in Microsoft Copilot and other AI tools, particularly if the podcast is on YouTube, where transcripts are often available.
In Microsoft Copilot, for example, one way would be to start the conversation on a computer in the Edge sidebar before leaving the house, and then continue it in the Copilot app in the car, assuming you’re logged into your account in both places. You could do something similar in ChatGPT or other AI tools.
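If you want to wire up something like this yourself, here's a minimal sketch of the "chat with a transcript" workflow for an episode published on YouTube. It's an illustration, not how Copilot works internally: it assumes the third-party youtube-transcript-api package (whose interface has changed across versions) to fetch captions, and OpenAI's Python SDK as a stand-in for whatever assistant you prefer; the VIDEO_ID and model name are placeholders.

```python
# A rough sketch: fetch a YouTube transcript, then hold a Q&A session with it.
#   pip install youtube-transcript-api openai
from youtube_transcript_api import YouTubeTranscriptApi
from openai import OpenAI

VIDEO_ID = "dQw4w9WgXcQ"  # hypothetical episode ID; replace with a real one

# Fetch the caption segments and flatten them into one block of text.
# (In youtube-transcript-api 1.x this is an instance method, .fetch(), instead.)
segments = YouTubeTranscriptApi.get_transcript(VIDEO_ID)
transcript = " ".join(seg["text"] for seg in segments)

client = OpenAI()  # reads OPENAI_API_KEY from the environment
messages = [
    {"role": "system",
     "content": "Answer questions about this podcast transcript:\n" + transcript},
]

# A simple text loop; a voice interface like Copilot's would wrap the same
# exchange in speech-to-text on the way in and text-to-speech on the way out.
while True:
    question = input("You: ")
    if not question:
        break
    messages.append({"role": "user", "content": question})
    reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    print("Assistant:", answer)
```

Keeping the full running message history in the loop is what makes it a conversation rather than a one-shot summary; you can interrupt with a follow-up at any point, which is exactly the modality Nadella is praising.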
But it’s not always seamless, or at least not obvious to most users. It can also be tough to find the transcript for a podcast, depending on where it’s published. It makes me wonder if there’s a startup opportunity here.