We are getting closer to an AI that behaves with the same ease as the one that stars in the movie Her (available on Filmin and Prime Video). Yesterday OpenAI presented a new language model, GPT-4o (the “o” stands for “omni”), that interacts through image and voice and responds almost instantly, at the speed of a human, in a way that is apparently indistinguishable from a person.
OpenAI defines the new model as “a step forward towards a much more natural interaction between the human being and the computer”. GPT-4o accepts any combination of text, audio and image as input, and generates any combination of text, audio and image in response. The most surprising thing is its speed. It can answer questions in as little as 232 milliseconds, comparable to human response time, so conversations flow naturally.
Through the ChatGPT app, the AI can access an image from the mobile camera or a screenshot, as well as the user’s voice from the microphone. The same can be done with the computer screen, on which it can be shown, for example, a fragment of programming code so that it can point out directly where the errors are.
GPT-4o can act as a simultaneous translator in 50 languages, use different tones of voice and even sing. OpenAI’s chief technology officer, Mira Murati, and two of the company’s engineers showed several examples of its capabilities.
One of them showed his face to the camera and asked ChatGPT to try to tell him what emotions he was feeling. “You seem to be feeling quite happy and cheerful, with a big smile and maybe even a touch of excitement. Whatever you’re going through, you seem to be in a very good mood. Share the source of those good vibes,” the AI replied.
“The reason I’m in a very good mood,” said the engineer, “is that we were giving a presentation showing how useful you are.” “Oh, stop. You’re making me blush,” replied the AI.
OpenAI began rolling out the text and image functions of GPT-4o in ChatGPT yesterday. The model will be available in the free tier, although paid users will have message limits up to five times higher. The company announced that it will release a new preliminary version of voice mode with GPT-4o in ChatGPT Plus in the coming weeks.