ChatGPT has started speaking

September 28, 2023

ChatGPT can now speak and hold live voice conversations, in addition to analyzing images. OpenAI, the developer of ChatGPT, has unveiled a suite of updates for its generative artificial intelligence tool, with voice and image capabilities set to roll out to paying users worldwide.

This development means that users can now hold voice conversations with ChatGPT and share images relevant to their discussions. OpenAI views this as a gateway to a wide array of creative and accessibility-focused applications. The company suggests scenarios such as discussing interesting landmarks while traveling, deciding what to make for dinner by snapping pictures of the fridge and pantry, and helping children with math problems by sharing a photo of the problem and asking for hints.

To enable this, OpenAI has built a new text-to-speech model capable of generating human-like audio from just text and a few seconds of sample speech. Users will have a choice of five voices, each created in collaboration with professional voice actors. On the visual front, users will be able to mark up an image to highlight the specific area they want to discuss. ChatGPT will then apply its language reasoning skills to interpret and respond to photographs, screenshots, and documents.

ChatGPT has garnered its fair share of controversy since its debut less than a year ago. Concerns have been raised regarding data privacy, response accuracy, and its potential for malicious use. In this announcement, OpenAI attempts to address some of these concerns by emphasizing a gradual release strategy that allows for continuous improvement and risk mitigation. This approach becomes crucial as ChatGPT gains advanced voice and vision capabilities.

OpenAI acknowledges the potential risks, such as malicious actors misusing the new voice technology for impersonation or fraud. To limit these risks, the company has restricted the technology to a specific use case, voice chat, and built the voices with professional voice actors it worked with directly.

These updates position ChatGPT to compete more closely with popular voice assistants like Apple’s Siri and Amazon’s Alexa. The new features will be accessible to ChatGPT Plus and Enterprise users within the next two weeks, with voice functionality available on both iOS and Android, while image capabilities will be accessible across all platforms.

In a related development, Google’s ChatGPT competitor, Bard, recently incorporated data from Google services, enabling it to read emails, summarize documents, and fact-check information.

However, OpenAI also faces the looming threat of a potentially significant lawsuit that could compel it to purge ChatGPT’s training dataset and start anew. The New York Times is reportedly contemplating a legal challenge over ChatGPT’s use of copyrighted content. If the challenge succeeds, OpenAI could be required to pay substantial fines for each piece of infringing content used in training its AI system and to delete the offending data.
