ChatGPT Gains New Voice and Image Features

OpenAI, the company led by Sam Altman, has updated its flagship product, ChatGPT. The popular chatbot's capabilities have been expanded, and from now on it will be able to ‘see, hear and speak.’ As the company put it, “we are beginning to deploy new voice and image functions in ChatGPT.” The update offers a more intuitive interface by letting you hold a voice conversation or show ChatGPT what you are talking about. “Voice and image give you more ways to use ChatGPT in your life. Take a photo of a point of interest while you travel and have a live conversation about what you find interesting. When you are at home, take photos of the refrigerator and pantry to know what’s for dinner,” the company said by way of example.

Over the next two weeks, Plus and Enterprise users will gain access to ChatGPT's voice and image features. Voice will be available on iOS and Android, while images will be available on all platforms.

Gradual rollout

OpenAI's stated objective is to build “safe” and “beneficial” artificial intelligence. For that reason, the company believes its tools should be made available gradually, as this “allows us to introduce improvements and refine risk mitigation over time while preparing everyone for more powerful systems in the future.” This strategy, it stressed, is even more important with advanced voice and vision models.

The new voice technology, which can create realistic synthetic voices from a few seconds of real speech, “opens the doors to many creative and accessibility-focused applications.” However, these capabilities also present new risks, such as the possibility of malicious actors impersonating public figures or committing fraud.

That is why OpenAI has chosen to apply this technology to a specific use case: voice chat. “The voice chat has been created with voice actors that we have worked with directly.” The company is also collaborating with others in a similar way. For example, Spotify is using the technology to pilot its voice translation feature, which helps podcast creators expand the reach of their storytelling by translating podcasts into other languages in the creators' own voices.

Vision models also bring new challenges, ranging from hallucinations about people to over-reliance on the model’s interpretation of images in high-risk areas. “Before widespread deployment, we tested the model with red teamers for risk in areas such as extremism and scientific proficiency, and with a diverse set of alpha testers.” According to the company, this research helped it settle on some key details for responsible use.

A useful and safe vision

Like other ChatGPT features, vision aims to help users in their daily lives, and it does so best when it can see what you see. This approach has been informed directly by OpenAI's work with Be My Eyes, a free mobile application for blind and low-vision people, to understand its uses and limitations. “Users have told us that they find it valuable to have general conversations about images where people appear in the background, for example, if someone appears on TV while you are trying to figure out how to adjust the remote control.”

OpenAI has also taken technical steps to significantly limit ChatGPT’s ability to analyze and make direct claims about people, since ChatGPT is not always accurate and these systems must respect people’s privacy. “Real-world use and feedback will help us further improve these safeguards without the tool ceasing to be useful,” the company said.

Transparency about model limitations

Users may come to depend on ChatGPT for specialized topics, for example in fields such as research. “We are transparent about the limitations of the model and discourage higher risk use cases without proper verification.” In addition, the company noted that the model is competent at transcribing English text but performs poorly with some other languages, especially those written in non-Roman scripts. “We advise our non-English speaking users not to use ChatGPT for this purpose.”

techgogoal
