OpenAI now allows ChatGPT to speak, see, hear. Here’s how people can use new voice and image features

OpenAI has amped up the capabilities of ChatGPT, its generative AI bot, by quite a few notches. ChatGPT can now not only hold voice-based conversations but also see and understand images.

In effect, ChatGPT can now hear, speak to and see the people it interacts with.

Here’s how ChatGPT’s new features work.

Voice Conversations
Users can now hold back-and-forth spoken conversations with their AI assistant. Whether you’re on the move, seeking a bedtime story for your family, or settling a dinner table debate, ChatGPT’s voice capabilities are primed to assist.

To start using voice, go to Settings in the mobile app, select “New Features,” and opt into voice conversations. Once it is activated, tap the headphone icon in the top-right corner of the home screen and choose from five distinct voices.

Each voice was created in collaboration with professional voice actors and is produced by a new text-to-speech model designed to generate human-like audio. Additionally, Whisper, OpenAI’s open-source speech recognition system, transcribes the user’s spoken words into text so that ChatGPT can respond to them.

Images and ChatGPT
Users can now present one or more images to ChatGPT for troubleshooting, content exploration, or complex data analysis. Whether you’re attempting to diagnose why your grill won’t start, plan a meal based on the contents of your fridge, or decode a data graph for work, ChatGPT is here to assist.

To use this feature, tap the photo button to capture or select an image. On iOS or Android, tap the plus button first. You can also include multiple images or use the drawing tool to guide your assistant.

Image understanding is powered by multimodal versions of GPT-3.5 and GPT-4, which apply their language reasoning skills to a wide range of visual content, including photos, screenshots, and documents that contain both text and images.

Safety and Responsiveness
Voice and image capabilities will be rolled out in phases to Plus and Enterprise users over the next two weeks. Voice is available on iOS and Android, where users opt in through Settings, while image capabilities will be available on all platforms.

These advanced capabilities carry many potential risks. On the voice side, OpenAI is limiting the technology to voice chat, and the voices have been developed in collaboration with voice actors to ensure authenticity and safety.

Regarding image input, OpenAI has taken measures to limit ChatGPT’s capacity to analyze and make direct statements about individuals to respect their privacy. Real-world usage and user feedback will play a pivotal role in further enhancing these safeguards while upholding the utility of the tool.



from Firstpost Tech Latest News https://ift.tt/zT3M8DE