Does AGI Really Threaten the Survival of the Species?

Have you heard the news? Artificial general intelligence, or AGI, is going to kill us all. Even people living in the world’s most remote regions — the Amazon rainforest, northern Siberia and Antarctica. According to “AI doomers,” a “misaligned” AGI will annihilate the entire human population not because it hates us or has a ghoulish lust for omnicidal violence, but simply because human beings “are made out of atoms which it can use for something else.” We are resources that it will utilize to achieve its ultimate goals, whatever they happen to be. Or so the “doomers” claim.

This idea of AGI posing an “existential risk” to humanity has gained a lot of steam following the release of ChatGPT last November. Geoffrey Hinton, the “godfather of AI,” has expressed this worry, as has the self-described “decision theorist” Eliezer Yudkowsky, who argued in Time magazine that we should be willing to risk thermonuclear war to prevent AGI in the near future. A more recent article from OpenAI, the for-profit company behind ChatGPT and GPT-4, cites “existential risks” in calling for an international agency to “regulate” AGI research.

But just how plausible are these “existential risks” from AGI? What exactly is an “existential risk,” and why is it so important to avoid?

The first thing to note is that “existential risk” has both colloquial and canonical definitions. In the colloquial sense, it simply refers to the destruction of our species, resulting in total human extinction, whereby Homo sapiens disappears entirely and forever. I suspect this is what most people think of when they hear the term “existential risk,” which makes sense given that “existential” means “of or relating to existence.”

The canonical definition is far more controversial because it’s bound up with the “TESCREAL bundle” of ideologies that I discussed in my last article for this series. To recap briefly, the acronym “TESCREAL” stands for — and prepare yourself for a mouthful here — transhumanism, Extropianism, singularitarianism, cosmism, Rationalism, Effective Altruism and longtermism. The clunky acronym is important because these ideologies are hugely influential within Silicon Valley and the tech world more generally. Elon Musk is a TESCREAList, as is Sam Altman, the CEO of OpenAI, which Musk and Altman co-founded with money from Peter Thiel and others. Understanding this bundle is key to making sense of what’s going on with the tech elite, and it’s what gave rise to the canonical definition of “existential risk” in the early 2000s.

On this definition, “existential risk” refers to any event that would prevent us from fulfilling “our longterm potential” in the universe. What does this mean? According to TESCREALists, fulfilling our “potential” would involve things like — and this is not an exaggeration — creating a new race of superior “posthumans,” colonizing the universe, building giant computer simulations full of trillions and trillions of “happy” digital people and ultimately generating “astronomical” amounts of “value” over the coming “millions, billions and trillions of years.” Put differently, the canonical definition of existential risk is anything that ruins our chances of realizing a techno-utopian world among the heavens populated by enormous numbers of digital posthumans awash in — as Nick Bostrom put it in his “Letter from Utopia” — “surpassing bliss and delight.”

This is what many TESCREALists mean when they talk about “existential risks.” It’s why they argue that existential risks “must be avoided at any cost,” in Bostrom’s words, as failing to avoid them would mean the loss of utopia and astronomical value. It’s why Yudkowsky argued in his Time article that we should be willing to risk an all-out thermonuclear war to prevent apocalyptic AGI from being developed. The reasoning goes like this: thermonuclear war probably wouldn’t completely destroy our “vast and glorious” future in the universe, because there would almost certainly be some survivors who could rebuild society. According to one recent study, if a thermonuclear war were to break out between the U.S. and Russia tomorrow, a reassuring 3 billion people would remain — plenty enough to keep industrial civilization, our springboard to outer space, roaring.

In contrast, doomers like Yudkowsky believe that an AGI apocalypse would kill every last person on Earth, thus forever erasing the techno-utopian world that we could have otherwise created. An AGI apocalypse would be an existential catastrophe, whereas a thermonuclear war almost certainly wouldn’t. Hence, when someone on Twitter posed the question, “How many people are allowed to die to prevent AGI?” Yudkowsky offered a jaw-dropping response:

There should be enough survivors on Earth in close contact to form a viable reproductive population, with room to spare, and they should have a sustainable food supply. So long as that’s true, there’s still a chance of reaching the stars someday.

This points to why I’ve argued that you shouldn’t care about “existential risks” in the canonical sense: the utopian vision at the heart of this concept is a total nonstarter, and could even be dangerous. Who cares if 10^58 digital people — Bostrom’s actual estimate — living in huge computer simulations spread throughout the universe exist or not? Many TESCREALists would say that the nonbirth of these people would constitute a horrendous moral catastrophe. But would it? I’d say no, and I think most people agree with me. If this “utopia” is a nonstarter, then who cares about “existential risks” on the canonical account?

Yet the situation is more insidious than this, because both the pursuit and realization of this “utopia” would almost certainly have catastrophic consequences for most of humanity. Consider OpenAI. As noted above, the company was founded by TESCREALists and is now run by one. Its explicit mission is “to ensure that artificial general intelligence benefits all of humanity.” Yet the large language models (LLMs) behind ChatGPT and GPT-4 were built on massive amounts of intellectual property theft. This is why a number of lawsuits have recently been filed against OpenAI. Even more, in curating the training data for its LLMs, OpenAI hired a company that paid Kenyan workers as little as $1.32 per hour to sift through “examples of violence, hate speech and sexual abuse” on the Internet, which left some workers traumatized. Does this look like “benefitting all of humanity”? Obviously not, which illustrates how the pursuit of “utopia” will leave a trail of destruction in its wake: Some people, perhaps a very large number, are going to get trampled in the march to paradise.

Now let’s imagine that TESCREALists succeed in creating their “utopia.” Who exactly is this utopia for? Will some people get left out? If so, which people? One of the most striking things about the TESCREAL literature is that there’s virtually zero consideration of what the future should look like from non-Western perspectives. When TESCREALists at the influential Future of Humanity Institute, founded by Bostrom in 2005, talk about “the future of humanity,” they aren’t talking about the interests of “humanity” as a whole. Rather, they’re promoting an extremely narrow, thoroughly Western vision founded on the Baconian dream of plundering the universe and the capitalistic desideratum of maximizing economic productivity to the limits.

In fact, Bostrom once literally defined an “existential risk” as anything that would prevent us from reaching “technological maturity,” defined as “the attainment of capabilities affording a level of economic productivity and control over nature close to the maximum that could feasibly be achieved.” By subjugating nature and maximizing productivity, he reasons, we would be optimally positioned to realize the TESCREAL vision of a posthuman paradise. Absent is any reference whatsoever to alternative conceptions of the future from the perspective of, say, Muslims, Afrofuturism, feminism, queerness, disability and so on. Also missing is any discussion of which future might be best for the nonhuman creatures with whom we share planet Earth. Prominent TESCREAList William MacAskill even argues that our destruction of the natural world might very well be net positive. Why? Because wild animals suffer, and fewer wild animals means less wild-animal suffering. In his words,

if we assess the lives of wild animals as being worse than nothing on average, which I think is plausible (though uncertain), then we arrive at the dizzying conclusion that from the perspective of the wild animals themselves, the enormous growth and expansion of Homo sapiens has been a good thing.

Far from being “utopia,” the grandiose fantasies at the heart of TESCREALism look more like a dystopian nightmare, and for many groups of people it might even entail their elimination. Almost every imagined utopia leaves some people out — that’s the nature of utopia — and so far as I can tell, if the TESCREAL vision were fully realized, most of humanity would lose (to say nothing of the biosphere more generally).

The canonical definition of “existential risks” is more than a nonstarter, then: it’s built around a deeply unappealing vision of the future designed by privileged white men at elite universities and within Silicon Valley, which they now want to impose on everyone else, whether we like it or not.

But what of “existential risks” in the colloquial definition, simply meaning “the annihilation of every human on Earth”? Should you care if AGI poses an existential risk in this sense? The answer is “Of course you should, since human extinction would mean that you, your family and friends and everyone else would die a terrible death.” Does this mean that we really ought to take Yudkowsky and the other “doomer” TESCREALists seriously?

The answer depends on how plausible an AGI apocalypse is. Not that long ago, I found the arguments for why a “misaligned” AGI would destroy us somewhat convincing, although I have recently changed my tune. A major problem with these arguments is that it’s not clear how exactly an AGI could actually kill everyone on Earth. In his 2014 bestseller “Superintelligence,” Bostrom outlines a situation in which AGI “strikes” humanity “through the activation of some advanced weapons system that the AI has perfected.” He continues:

If the weapon uses self-replicating biotechnology or nanotechnology, the initial stockpile needed for global coverage could be microscopic: a single replicating entity would be enough to start the process. In order to ensure a sudden and uniform effect, the initial stock of the replicator might have been deployed or allowed to diffuse worldwide at an extremely low, undetectable concentration. At a pre-set time, nanofactories producing nerve gas or target-seeking mosquito-like robots might then burgeon forth simultaneously from every square meter of the globe.

Yudkowsky points to a different set of possibilities. After being asked how an AGI could obliterate humanity so that there are “no survivors,” he speculates that it could synthesize a pathogenic germ that “is super-contagious but not lethal.” Consequently,

no significant efforts are [put] into stopping this cold that sweeps around the world and doesn’t seem to really hurt anybody. And then, once 80% of the human species has been infected by colds like that, it turns out that it made a little change in your brain somewhere. And now if you play a certain tone at a certain pitch, you’ll become very suggestible. So, virus-aided, artificial pathogen-aided mind control.

Realizing that this wouldn’t necessarily kill everyone, Yudkowsky then proposed a different scenario. Imagine, he says, that an AGI hell-bent on destroying us creates,

something that [can] reproduce itself in the air, in the atmosphere and out of sunlight and just the kind of atoms that are lying around in the atmosphere. Because when you’re operating at that scale, the world is full of an infinite supply of … perfect spare parts. Somebody calculated how long it would take “aerovores” to replicate and blot out the sun, use up all the solar energy. I think it was like a period of a couple of days. At the end of all this is tiny diamondoid bacteria [that] replicate in the atmosphere, hide out in your bloodstream. At a certain clock tick, everybody on Earth falls over dead in the same moment.

Does any of this make sense, or does it sound like utter madness? As Carl Sagan famously declared, extraordinary claims require extraordinary evidence, and the doomer’s claim that a “misaligned” AGI will kill the entire human population is about as extraordinary as they get. So where is the evidence? The scenarios above are hardly convincing.

A sci-fi rendering of AGI contemplating ever-more elaborate schemes for harvesting human bodies. Image: Adobe

The doomers have two responses to this. The first is based on the third of Arthur C. Clarke’s “three laws,” which states: “Any sufficiently advanced technology is indistinguishable from magic.” We can illustrate this by analogy: picture yourself traveling back in time, convincing a Neanderthal to hop in the time machine with you, and then zipping back to the present. Imagine what this Neanderthal would think about our world. Airplanes, brain surgery, smartphones and GPS would all appear to be “magic.”

AI doomers could use this to claim that: “Look, maybe we can’t come up with a plausible way that AGI would kill everyone. But that’s no reason to reject our doomsday hypothesis. By definition, a superintelligent AGI would be unimaginably ‘smarter’ than us. This would enable it to create super-advanced technologies, and super-advanced technologies would be able to manipulate and rearrange the world in ways that are, from our limited human point of view, indistinguishable from magic.”

But if we’re talking about a “god-like AI” that traffics in “scientific magic,” then literally anything goes. There are no longer any rules to the conversational game, no sense to make of this post-AGI world. Maybe the AGI magically destroys us — but since we’re in the land of magic, maybe it doesn’t. The doomers would point out that lots of potential scenarios all lead to the conclusion that the AGI would try to kill us, but the arguments for this claim are all formulated within mere human frameworks. Perhaps these frameworks are wrong in ways that we simply cannot understand, just as a dog will never make sense of its owner’s behavior by extrapolating its own thoughts and instincts into the realm of human action. If we’re talking about magic beyond our comprehension, then there’s just nothing meaningful to say, and hence these doomsday worries aren’t based on anything intelligible.

The second response to the “extraordinary evidence” objection involves “expected value.” AI doomers might say that even if these apocalyptic scenarios are extremely unlikely, the losses that our extinction would entail are so great that we should still take them very seriously. In other words, if a risk is the probability of an outcome multiplied by its badness, and if the badness of an outcome is enormous, then the risk could be very large even if the probability is minuscule. Maybe “target-seeking mosquito-like robots” and atmospheric self-replicating “diamondoid bacteria” are highly improbable, but the stakes (everyone on Earth dying) are so massive that we should still worry about the AGI apocalypse.

The problem with this is that the very same argument could be used to support all manner of absurd conclusions. What if I told you that the next time you sneeze, the universe will explode in a giant, violent conflagration? You might think that this warning is ridiculous, lacking any explanation of a causal mechanism. But if you take seriously the “expected value” definition of “risk” mentioned above, then you might have a pretty good “reason” never to sneeze again. In fact, TESCREALists are aware of this problem: they call it “Pascal’s mugging,” which describes a situation in which a mugger says that if you don’t give him $5, he’ll torture trillions of people in a parallel universe. Although his claim is almost certainly nonsense, you can’t know for sure that he’s lying — the universe is, after all, a very strange place! So, you crunch the numbers, and ultimately hand over your $5.

The mugger used expected value theory as his weapon, and it worked.
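To see how the arithmetic behind the mugging plays out, here is a minimal sketch in Python. The probability and the stand-in figure for “trillions of people” are invented purely for illustration; nothing in the thought experiment fixes them, and the final comparison naively treats one tortured person as at least one dollar’s worth of badness.

```python
from fractions import Fraction


def expected_loss(probability, badness):
    """Risk as defined above: probability of an outcome times its badness."""
    return probability * badness


# Hypothetical numbers, chosen only to show the structure of the argument.
p_mugger_truthful = Fraction(1, 10**9)  # assumed: one-in-a-billion chance he is honest
people_tortured = 3 * 10**12            # assumed stand-in for "trillions of people"
cost_of_paying = 5                      # the $5 he demands

# Expected number of people tortured if you refuse to pay.
risk_of_refusing = expected_loss(p_mugger_truthful, people_tortured)
print(risk_of_refusing)

# Naive comparison (people against dollars) that makes paying look "rational".
print(risk_of_refusing > cost_of_paying)
```

Shrinking the assumed probability does not rescue you, because the mugger can always inflate the stakes faster than you can discount his honesty. That is the trap the thought experiment is meant to expose.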

Where does all this leave us? On the one hand, you shouldn’t care about “existential risks” in the canonical sense used by TESCREALists. This is because the concept is intimately bound up with a deeply problematic vision of utopia built on Western, Baconian-capitalistic values of exploitation, subjugation and maximization. Not only is this vision of 10^58 people in huge computer simulations a complete nonstarter, but its pursuit and realization would be catastrophic for most of humanity. The utopian world of TESCREALism would be a dystopia for everyone else.

On the other hand, you should care about the possibility of an AGI causing an “existential” catastrophe if understood colloquially as total human extinction. The problem, though, is that there’s no plausible account of how an AGI could realistically accomplish this, and claiming that it would employ “magic” that we just can’t understand essentially renders the whole conversation vacuous, since once we’ve entered the world of magic, anything goes. To repurpose a famous line from Ludwig Wittgenstein: “What we cannot speak about we must pass over in silence.”

This is why I’ve become very critical of the whole “AGI existential risk” debate, and why I find it unfortunate that computer scientists like Geoffrey Hinton and Yoshua Bengio have jumped on the “AI doomer” bandwagon. We should be very skeptical of the public conversation surrounding AGI “existential risks.” Even more, we should be critical of how these warnings have been picked up and propagated by the news, as they distract from the very real harms that AI companies are causing right now, especially to marginalized communities.

If anything poses a direct and immediate threat to humanity, it’s the TESCREAL bundle of ideologies that’s driving the race to build AGI, while simultaneously inspiring the backlash of AI doomers who, like Yudkowsky, claim that AGI must be stopped at all costs — even at the risk of triggering a thermonuclear war.
