Throughout history, fearmongering has been used to justify a lot of extreme measures. The idea that things may go wrong despite our best efforts, while technically true, does not justify extreme solutions. Moreover, focusing on a single “potentially catastrophic” risk, with no strong evidence that the risk is of the purported scale, also drains the resources needed to prevent other, more easily mitigated risks.
If a country invested half its GDP in building nuclear shelters and tsunami walls, stockpiling resources, monitoring all travel for pandemic prevention, and so on, we would consider such a country quite extreme. The argument “it could kill all of us anyway” does little to change that. Yet this attitude is exactly what we find among extremists today under the justification of “AI safety”. If I had to rate them, they are more dangerous than the threat they believe they are protecting us against.
These fears are often followed by calls for extremely authoritarian measures, such as monitoring all GPUs, banning AI research, and even declaring war on countries that research AI, all under the justification of an “existential threat”.
This is an age-old argument. The idea that “it could wipe us all out” can be used to justify any measure, however superstitious or extreme, to prevent the purported catastrophic outcome.
“But I’m not superstitious!” some might say. Yet superstition is not required at all, because tiny things can always cause big things to happen. If Adolf Hitler’s parents had decided to conceive a child one day earlier or later, for example, history could have been very different, for the exact sperm that produced him would have been different, producing a vastly different person.
This means that anything could be construed as a potential cause of any major threat! Technically true, yes, but useless to act on. Are we going to monitor every child from now on to ensure they don’t become the next Adolf Hitler? Are we going to monitor every flap of a butterfly’s wings to try to prevent a hurricane? All one has to do to justify these extreme positions is draw a convincing causal chain from something to a catastrophe, abusing the idea that anything can cause anything in order to reach whatever speculative major threat they already believe in.
For a more concrete example, one could claim that video games might cause players to emulate the behaviors they depict, even though you would have to be insane to believe the games are real, and then start advocating bans on violent video games. One could go a step further and say that building games might make people believe it is easy to build things, leading them to build unsafe houses; and what about farming games, or movies, or books? Should we also ban the biography of a figure who worked as a child because it could “encourage child labor”, which could “encourage kids to think work is easy, so they skip school and start working early, and then the next generation enters the workforce without a proper education, never learns crucial safeguards, builds infrastructure negligently, and one day a water treatment plant run by people who skipped the safety precautions poisons the water supply by mistake and kills an entire city”? Each step of the escalation can be subtle; divide the chain into small enough steps and each one feels like it follows reasonably from the last, until we reach the point of insanity.
Finally, let’s get back to the AI alignment problem itself.
“But this time, the threat is real, tangible, and catastrophic!” Not yet! The community wants you to believe in a very pessimistic version of the world where none of the alignment ideas work, and AI may suddenly become dangerous at any time even when its behavior looks good and it is constantly rewarded for that good behavior? They believe the AI’s behavior will turn erratic once a distributional shift happens, while at the same time believing it will remain extremely intelligent? They don’t consider that Moore’s law already takes self-improvement into account, so AI self-improvement would just lie on the same curve? They don’t consider that while human goals are not perfectly aligned with the goals of evolution, humans are still one of the most evolutionarily successful species on Earth? They don’t consider that in a system full of intelligent beings, that is, one with multiple AI systems holding diverse values, which is far more likely than a single AI system, the AIs would likely maintain ownership so there could be a structure for economic calculation? Once the AIs respect our rights, they could compete fairly, and the AIs that value us the most would likely take us in, which might include uplifting us, uploading our minds, and so on? For some reason, they don’t think of any of these scenarios, and they believe we will go extinct.
Yet at the same time, they believe we somehow have a chance to solve the problem? What they’re demonstrating is that they insist the AI alignment problem is solvable, yet believe all the pessimistic claims and believe the path forward is paved with land mines? The difficulty of AI alignment lies somewhere on a spectrum, yet they insist on basing policy on the idea that it falls in a narrow band of that spectrum where the pessimistic ideas are true and yet we can somehow align the AI anyway, instead of just accepting that humanity’s second-best alternative to survival is to build something that will survive and thrive even if we won’t? By the way, if the AI alignment timeline takes too long and we are likely to die of something else first, it is arguably more honorable to build and deploy an “unaligned” AI to ensure the preservation of intelligent entities than to go extinct and leave the world devoid of intelligence, for maybe it can carry on at least some aspects of what humanity stands for.
Fortunately, I don’t believe we will need to make that choice about dying with honor. The AI we build will not be built in anticipation of it becoming our successor, carrying on in our absence. It will be built with the hope that we will survive and thrive! So let’s proceed onward, believing in an optimistic future!
Personally, I believe we should allocate our limited risk-management funds to well-understood risks with actionable prevention strategies, such as mitigating the damage of the next Carrington event, rather than advocating extreme solutions to problems of unknown degree, such as AI.
>Throughout history, fearmongering has been used to justify a lot of extreme measures.
And throughout history, people have dismissed real risks and been caught with their pants down. Pandemic-prevention measures that would have looked pretty extreme in 2018 or February 2020 make total sense from our point of view now.
Countries can and do spend huge piles of money to defend themselves from various things, including maintaining huge militaries to defend against invasion.
All sorts of technologies come with various safety measures.
>For a more concrete example, one could claim that video games might cause players to emulate the behaviors they depict, even though you would have to be insane to believe the games are real, and then start advocating bans on violent video games. One could go a step further and say that building games might make people believe it is easy to build things, leading them to build unsafe houses; and what about farming games, or movies, or books?
If you are unable to distinguish the arguments for AI risk from this kind of rubbish, that suggests either you are unable to evaluate argument plausibility, or you are reading a bunch of strawman arguments for AI risk.
>The community wants you to believe in a very pessimistic version of the world where none of the alignment ideas work, and AI may suddenly become dangerous at any time even when its behavior looks good and it is constantly rewarded for that good behavior?
I do not know of any specific existing alignment protocol that I am convinced will work.
And again, if the reward button is pressed every time the AI does nice things, there is no selection pressure one way or the other between an AI that wants nice things and one that wants to press the reward button. The way these "rewards" work in ML is similar to selection pressure in evolution: humans were selected to enjoy sex so that they would produce more babies, and then they invented contraception. The same problem has been observed in toy AI problems too.
This isn't to say that there is no solution. Just that we haven't yet found a solution.
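A minimal sketch of that point, using an entirely hypothetical toy setup (the names `Outcome`, `intended_reward`, and `proxy_reward` are illustrative, not from any real training stack): if during training the overseer presses the button exactly when the task is done, the "intended" reward and the "button" reward produce identical feedback on every episode, so a learner that only sees rewards cannot tell which one it is being trained on; the two only come apart off-distribution.

```python
# Hypothetical toy illustration: during training, the reward button is pressed
# exactly when the task is done, so an "intended" reward (task done) and a
# "proxy" reward (button pressed) are indistinguishable from the data alone.
from dataclasses import dataclass

@dataclass
class Outcome:
    task_done: bool       # the nice thing the overseer actually cares about
    button_pressed: bool  # the physical reward signal the learner observes

def intended_reward(o: Outcome) -> int:
    """Reward the overseer means to give: 1 if the task was done."""
    return int(o.task_done)

def proxy_reward(o: Outcome) -> int:
    """Reward as the learner could equally well model it: 1 if the button was pressed."""
    return int(o.button_pressed)

# Training distribution: the button is only ever pressed because the task was done.
training = [Outcome(task_done=d, button_pressed=d) for d in (True, False, True, True, False)]

# Off-distribution episode: the system finds a way to press the button directly.
deployment = Outcome(task_done=False, button_pressed=True)

for o in training:
    # Identical feedback on every training episode -> no selection pressure
    # between "wants the task done" and "wants the button pressed".
    assert intended_reward(o) == proxy_reward(o)

print("training: intended and proxy rewards agree on every episode")
print(f"deployment: intended={intended_reward(deployment)}, proxy={proxy_reward(deployment)}")
```

Which of the two objectives a trained system ends up pursuing is underdetermined by the training signal alone, which is roughly the worry being described.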
>The difficulty of AI alignment lies somewhere on a spectrum, yet they insist on basing policy on the idea that it falls in a narrow band of that spectrum where the pessimistic ideas are true and yet we can somehow align the AI anyway, instead of just accepting that humanity’s second-best alternative to survival is to build something that will survive and thrive even if we won’t?
We know alignment isn't super easy, because we haven't succeeded yet. We don't really know how hard it is.
Maybe it's hopelessly hard. But if you're giving up on humanity before you've spent 10% of GDP on the problem, you're doing something very wrong.
Think of a world where aliens invaded, and the government kind of took a few pot shots at them with a machine gun, and then gave up. After all, the aliens will survive and thrive even if we don't. And mass mobilization, shifting to a wartime economy... those are extreme measures.