
ChatGPT is now an assistant that can see, hear and speak

ChatGPT, developed by OpenAI, is introducing new capabilities that let you interact through voice and images, offering a more intuitive interface and more ways to integrate ChatGPT into your daily life. In a recent announcement on its website, OpenAI previewed these new features. The company also highlighted the benefits they bring and the challenges they present in the growing AI market.

ChatGPT: voice interaction

With the new voice functionality, users can have back-and-forth spoken conversations with ChatGPT. This makes it possible to use the assistant even on the move, broadening what the chatbot can be used for. For example, a user could ask ChatGPT to tell a children's story while traveling.

[Image: a story created by the chatbot]

Or, during a dinner with friends, a debate on a specific topic could emerge; in this case, users can use the bot to obtain accurate information and resolve the debate constructively.

ChatGPT's voice technology uses a new text-to-speech model that can generate human-like audio from text and a short sample of speech. OpenAI worked with professional voice actors to create the voices, making interaction with ChatGPT more natural and intuitive. In addition, Whisper, an open-source speech recognition system developed by OpenAI, transcribes spoken words into text, allowing the chatbot to understand and respond to user requests.

ChatGPT: visual interaction

Similarly, the AI model can now analyze one or more images, allowing users to solve problems, plan meals or analyze complex graphs. For example, a user could submit a photo of the contents of their refrigerator. The chatbot could then analyze the foods present and suggest recipes based on those ingredients, along with step-by-step instructions for preparation.



Furthermore, if the user needs to focus on a particular element in the image, ChatGPT's mobile app includes a drawing tool that lets you highlight specific areas of the image, making communication and analysis more precise.

Image understanding is powered by the multimodal GPT-3.5 and GPT-4 models. These models apply their language reasoning skills to a wide range of images, such as photographs, screenshots and documents that contain both text and images, allowing ChatGPT to understand and interpret the visual context in detail.

It is worth mentioning that OpenAI has recently integrated not only Canva but also DALL-E 3, its generative image model, into ChatGPT.

When and for whom it will be available

Over the next two weeks, OpenAI will roll out voice and images in ChatGPT to users with Plus and Enterprise subscriptions.

Voice interaction will be available on iOS and Android but not on the web version, which is the one most people use.

Visual interaction, by contrast, will be available on all platforms: Android, iOS and web.

Source | OpenAI

Gianluca Cobucci

Passionate about code, languages (both human and programming), and human-machine interfaces. Everything about technological evolution interests me. I try to share my passion as clearly as possible, relying on reliable sources and not "on the first pass".

