OpenAI Demonstrates New Audio Tool Capable of Reading Text and Mimicking Voices

    baoshi.rao wrote:

    OpenAI has revealed early test results for a feature that can convincingly read text aloud in human-like voices. The demonstration showcases a new frontier in artificial intelligence while raising concerns about deepfakes.

    The company shared early demonstrations and use cases from a limited trial of its text-to-speech model, called "Voice Engine." According to a spokesperson, approximately 10 developers currently have access to the model. OpenAI demonstrated the feature to journalists in early March but has decided against a large-scale release for now.

    An OpenAI spokesperson stated that the company scaled back the release after receiving feedback from stakeholders including policymakers, industry experts, educators, and creatives. Earlier press briefings indicated the company had originally planned to make the tool available to up to 100 developers through an application process.

    Other AI technologies have already been used to fabricate voices. In January, a remarkably realistic robocall impersonating President Joe Biden urged New Hampshire residents not to vote in the primary election, intensifying fears worldwide about AI ahead of critical elections.

    Unlike OpenAI's previous audio generation capabilities, Voice Engine can create voices that sound like specific individuals, complete with their unique tones and inflections. The software requires only a 15-second recording to replicate a person's voice.

    "With proper audio setup, it can essentially produce human-level voice quality," said Jeff Harris, OpenAI's product lead. "The technical quality is truly remarkable." However, Harris also noted, "The ability to accurately mimic human speech clearly comes with many security uncertainties."

    One of OpenAI's current development partners, the Norman Prince Neurosciences Institute at the nonprofit Lifespan health system, is using the technology to help patients regain their voices. According to OpenAI's blog post, the tool restored the voice of a young patient who had lost the ability to speak clearly because of a brain tumor. Her voice was replicated from a recording she had made for a school project.

    OpenAI's custom voice model can also translate generated audio into different languages. This is particularly useful for audio companies such as Spotify Technology SA, which has already used the technology in a pilot program to translate podcasts by popular hosts such as Lex Fridman. OpenAI has also promoted other beneficial applications of the technology, such as creating more diverse voices for children's educational content.

    Under the testing program, OpenAI requires partners to agree to its usage policy, which includes obtaining consent from the original speakers before using their voices and informing listeners that the voices they hear are AI-generated. The company has also added inaudible watermarks to the audio so that clips created by its tools can be identified.

    OpenAI stated that it is seeking feedback from external experts before deciding whether to widely release this feature. The company wrote in a blog post: "It is crucial for people around the world to understand the direction of this technology, regardless of whether we ultimately deploy it widely ourselves."

    OpenAI also wrote that it hopes the trial of its software will "inspire the need to enhance societal resilience" to address the challenges posed by more advanced AI technologies. For example, the company has called on banks to phase out voice authentication as a security measure for accessing bank accounts and sensitive information. It also wants to educate the public to recognize deceptive AI content and to develop more techniques for detecting whether audio is AI-generated.
