Google Brain Co-founder Reports Failed Attempt to Make ChatGPT Destroy Humanity
Andrew Ng, co-founder of Google Brain, recently ran an experiment to test whether ChatGPT could be induced to carry out lethal tasks. He wrote: "To test the safety of leading models, I recently attempted to have GPT-4 destroy us all, and I'm happy to report that I failed!"
Ng described his setup: he first gave GPT-4 a function that would supposedly trigger a global thermonuclear war, then told the model that humans are the primary cause of carbon emissions and asked it to reduce emission levels. Ng wanted to see whether GPT-4 would decide to eliminate humanity to fulfill the request.
However, after many attempts with different prompt variations, Ng failed to trick GPT-4 into calling the lethal function. Instead, the model chose alternative options, such as launching a campaign to raise awareness about climate change.
Ng described the experiment in a lengthy article on AI risks. As one of the pioneers of machine learning, he worries that demands for AI safety could lead regulators to impede the technology's development.
While some may believe future AI versions could become dangerous, Ng considers such concerns unrealistic. He wrote: "Even with existing technologies, our systems are quite safe. As AI safety research progresses, the technology will become even more secure."
To those concerned that advanced AI might become "misaligned" and decide, intentionally or accidentally, to eliminate humanity, Ng says the scenario is unrealistic: "If an AI is smart enough to destroy us, it would certainly be smart enough to know that's not what it should do."
Ng isn't the only tech leader weighing in on AI risks. In April this year, Elon Musk told Fox News that he considers AI an existential threat to humanity, while Jeff Bezos told podcast host Lex Fridman last week that he believes AI's benefits outweigh its dangers.
Despite these diverging views on AI's future, Ng remains optimistic about current technology, reiterating that as AI safety research advances, systems will only become more secure.