OpenAI on Monday introduced a new AI model and a desktop version of ChatGPT. GPT-4o offers enhanced speed, multilingual support, and omnimodal functions, promising a new era in AI interaction and accessibility.
On Monday, OpenAI unveiled its latest flagship AI model, GPT-4o, alongside updates featuring a new desktop service and enhancements to its voice assistant capabilities. Mira Murati, the Chief Technology Officer, took the stage at OpenAI’s headquarters, presenting the new model as a significant advancement in AI. GPT-4o will now be available to free users, offering a faster and more accurate AI experience previously exclusive to paid customers.
“This is the first time that we are really making a huge step forward when it comes to the ease of use,” said Murati during the live demo. “This interaction becomes much more natural and far, far easier.”
The San Francisco start-up showcased a series of improvements to its GPT-4 model, including enhancements in its ability to interpret voice, video, images, and code within a unified interface. The update “provides GPT-4 level intelligence, but it’s much faster and improves on capabilities across text, vision, and audio”, stated Murati before demonstrating live voice translation across languages.
The “o” in GPT-4o stands for omni, indicating its versatility. According to Murati, the new model enables ChatGPT to handle 50 different languages with enhanced speed and quality. Moreover, it will be accessible through OpenAI’s API, allowing developers to start building applications with the new model immediately. Murati mentioned that GPT-4o is twice as fast as and half the cost of GPT-4 Turbo.
During the presentation, OpenAI team members showcased the model’s audio capabilities by using it to help calm someone before a public speech. Mark Chen, an OpenAI researcher, highlighted the model’s ability to perceive emotions and handle interruptions from users. The team also demonstrated its capability to analyse facial expressions to discern the emotions of users.
In terms of interaction, ChatGPT’s audio mode greeted users with a cheerful message. OpenAI plans to test Voice Mode in the upcoming weeks, providing early access to paid subscribers of ChatGPT Plus. The company claimed that the new model can respond to audio prompts in a conversational time frame similar to human response times.
Chen demonstrated the model’s versatility by asking it to tell a bedtime story, adjust its voice tone to be dramatic or robotic, and even sing the story. Additionally, OpenAI stated that the new model can function as a translator, including in audio mode, as demonstrated by Chen conversing with Murati in different languages.
Team members also showcased the model’s ability to solve math equations and assist in coding tasks, positioning it as a strong competitor to Microsoft’s GitHub Copilot.
“The new voice (and video) mode is the best computer interface I’ve ever used,” OpenAI CEO Sam Altman said following the announcement. “It feels like AI from the movies; and it’s still a bit surprising to me that it’s real. Getting to human-level response times and expressiveness turns out to be a big change.”
Murati said that OpenAI will launch a ChatGPT desktop app with the GPT-4o capabilities, giving users another platform to interact with the company’s technology. GPT-4o will also be available to developers looking to build their own custom chatbots from OpenAI’s GPT store, a feature that will now also be available to non-paying users.
The advent of OpenAI’s GPT-4o is set to ripple through the tech ecosystem. The model’s integration into Apple’s iPhone operating system, as reported by Bloomberg, signals a strategic partnership that could redefine smartphone AI capabilities. This collaboration could position Apple to leapfrog over competitors with a generative AI product that surpasses the functionalities of Siri.
OpenAI’s expansion and its pursuit of partnerships underscore the company’s ambition to cement its AI’s presence across various platforms. Meanwhile, the legal landscape is stirring, with OpenAI facing lawsuits from media outlets over alleged copyright violations. These legal challenges highlight the complex interplay between innovation and intellectual property rights, as publishers like the New York Times seek compensation.
The new model was initially set to be released on Tuesday to ChatGPT Plus and Team customers, followed by Enterprise clients later. Additionally, it will be accessible to free users of ChatGPT starting Monday, with usage limits. “Over the next few weeks, we’ll be rolling out these capabilities to everyone,” said Murati.
ChatGPT Plus users will enjoy five times the message capacity compared to free users, while ChatGPT Team and Enterprise clients will have even more generous usage limits.
Since its launch in November 2022, ChatGPT has set records as the fastest-growing consumer app in history, boasting approximately 100 million weekly active users. OpenAI reports that over 92% of Fortune 500 companies are utilising the platform.