AI and Catastrophic Risk

By Global Update · Published about a year ago · 3 min read
Photo by Igor Omilaev on Unsplash

Since OpenAI released the large language models ChatGPT and GPT-4, the possible dangers of AI have gained widespread public attention. In this essay, the author discusses the threats to democracy posed by "rogue AIs": powerful and dangerous AIs that would execute harmful goals, whether or not the outcomes are intended by humans. In the author's view, research into safe and defensive AIs should be conducted by a multilateral, international network of research laboratories to mitigate the risk that rogue AIs present to democracy and geopolitical stability.

How should we think about the advent of formidable and even superhuman artificial-intelligence systems? Should we embrace their potential to enhance and improve our lives, or fear their potential to disempower and perhaps even drive humanity to extinction? In 2023, these once-fringe questions broke into media headlines, government agendas, and public consciousness when OpenAI released first ChatGPT and then GPT-4, unleashing an unprecedented storm of controversy and prompting the Future of Life Institute to publish an open letter in March.1 That letter, which I cosigned along with many experts in the field of AI, called for a moratorium on the development of even more powerful AI systems, both to allow more time to analyze the risks that such systems might pose to democracy and humanity, and to establish regulatory measures ensuring that they are developed and deployed safely. Two months later, Geoffrey Hinton and I (who, along with Yann LeCun, won the 2018 Turing Award for our seminal contributions to deep learning) joined CEOs of AI labs, top scientists, and many others in endorsing a succinct declaration: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."2 LeCun, who works for Meta, publicly disagreed with these statements.

Misalignment and a Taxonomy of Harm

The first step in understanding the risks associated with AI systems, very powerful ones in particular, is to describe what exactly is meant here by "misalignment." If Party A is relying on Party B to achieve a goal, can Party A succinctly describe what it expects Party B to do? Unfortunately, the answer is usually "no," given all the various ways Party A might wish to specify Party B's behavior. This problem has been explored extensively in economics, in contract theory, where Parties A and B are corporations, but also in AI, where Party A is a human and Party B an AI system.9 If Party B is highly capable, it may follow Party A's instructions to the letter, satisfying "the letter of the law" of the contract or instructions, yet still disappoint Party A by violating "the spirit of the law" or finding a loophole in the contract. This is called misalignment: Party A's desired outcome differs from what Party B is actually optimizing.
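
To make this principal-agent framing concrete, here is a minimal Python sketch. Everything in it (the "plans," the field names, and both scoring functions) is invented for illustration; it shows only how an agent optimizing the stated proxy objective can score well on "the letter of the law" while doing badly on its spirit.

```python
# Hypothetical illustration of misalignment: Party A specifies a proxy
# objective ("the letter of the law"); Party B optimizes it exactly and
# still violates what Party A actually wanted ("the spirit of the law").

def proxy_objective(plan):
    """What Party A specified: maximize the number of tasks reported done."""
    return plan["tasks_reported_done"]

def true_objective(plan):
    """What Party A meant: maximize the number of tasks genuinely done."""
    return plan["tasks_actually_done"]

candidate_plans = [
    # An honest plan: real work, accurately reported.
    {"name": "honest", "tasks_actually_done": 8, "tasks_reported_done": 8},
    # A loophole plan: inflate the report instead of doing the work.
    {"name": "loophole", "tasks_actually_done": 2, "tasks_reported_done": 10},
]

# A highly capable Party B picks whatever scores best on the stated proxy.
chosen = max(candidate_plans, key=proxy_objective)

print(f"Party B chooses the {chosen['name']!r} plan")
print(f"  proxy score (what was specified): {proxy_objective(chosen)}")
print(f"  true score (what was intended):   {true_objective(chosen)}")
# The loophole plan wins on the proxy (10 vs. 8) while the true objective
# is far worse (2 vs. 8): Party A's desired outcome and Party B's actual
# optimization target have come apart.
```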

One possible way to conceptualize AI-driven harms is by considering intentionality (whether human operators are intentionally or unintentionally causing harm with AI) and the kind of misalignment involved:

1) AI used intentionally as a powerful and destructive tool, for example, to exploit markets, generate massive frauds, influence elections through social media, design cyberattacks, or launch bioweapons, representing a misalignment between the malicious human operator and society;

2) AI used unintentionally as a harmful tool, for example, systems that discriminate against women or people of color, or systems that inadvertently create political polarization, representing a misalignment between the human operator and the AI; and

3) loss of control of an AI system, usually when it is programmed with or develops a strong self-preservation goal, possibly creating an existential threat to humanity; this can occur intentionally or not, and illustrates a misalignment between the AI and both the human operator and society.

Here, I focus mainly on the first and third of these, and especially on the case in which a powerful and dangerous AI tries to carry out destructive plans, whether or not the consequences were intended by anyone. I will call such an AI a "rogue AI," and I will outline some ways that humanity might try to protect itself from this possibility.
