I’m a bit puzzled by the public discourse on AI safety. It is great that the topic has now reached the mainstream media; merely a year ago it was very much a fringe topic. Nowadays it has become a primary discussion point in high-level political gatherings and has inspired numerous op-eds in prominent publications.
Nevertheless, most discussions advocating for increased AI safety recommend either “more alignment research”, stringent regulation, or total abstinence. While I believe that alignment research is a valuable endeavour, I am sceptical about how effective it will ultimately be. As for regulation, I am in favour of transparency legislation that ensures visibility into ongoing AI research and training. Heavy-handed regulation, on the other hand, could obstruct very important gains, such as making LLMs confabulate/hallucinate less. Since these main recommendations seem limited to me, I think we should consider more scenarios and contemplate possible short-term actions.
Let us entertain the idea that superhuman AI agents will become possible within the next decade. I would characterise such systems as computer systems that can interface with the internet and that, unlike today’s LLMs, can make effective long-term plans, learn continuously from single counterexamples, and perform almost every task a human can do via a computer at a higher level than the average human. At the same time they would of course be faster, and would be very proficient both in coding and in hacking computer systems.
I want to entertain this idea not because I think it is a very likely outcome, but because I think it is within the realm of the possible, given the wide uncertainty about how AI will develop over the near to medium term. Moreover, ten years ago I would not have correctly predicted today’s state of AI (which I suspect is true of a very large share of humanity), and so I want to consider a wider set of outcomes going forward.
If we take the above as a possible trajectory, what could we do today to make that world a lot better and safer? As the title of this post suggests, I think the number one thing we should be discussing is how to outlaw, or at least strongly limit, the number and capabilities of remote-controlled and autonomous weapons.
Riding with Death - Jean-Michel Basquiat, 1988
This idea had some traction in public discourse a few years ago, when petitions like the Lethal Autonomous Weapons Pledge from the Future of Life Institute were created. However, it seems to have lost momentum since then. I also believe that, as far as AI safety is concerned, autonomous and remote-controlled weapons are equivalent: if powerful AI systems were very capable of hacking computer systems, they could take over remote-controlled weapons just as well as autonomous ones, making both very dangerous.
Imagine that a very capable AI decides to bomb the White House or the EU Parliament. It might do so either because a malicious actor deliberately instructed it to, or because, in pursuit of some other optimisation goal, it concludes it would be more successful if the White House, the EU Parliament, or whatever else no longer existed (similar to the paper-clip-maximiser idea).
A world in which military bases are full of remote-controlled armed drones or artillery is one in which this AI would have a much higher probability of succeeding. If dropping bombs or firing weapons is something only humans on site can do, it would be much harder (and hopefully close to impossible) to execute such a plan. A human pilot would hopefully not bomb their own seat of government, even if they were sent plausible-sounding orders to do so.
I am of course aware that, with the Predator drones the US has employed in Iraq and Afghanistan, and the Turkish and Iranian drones employed on opposite sides of the war in Ukraine, we are already well into this world of remote-controlled armed weapons.
Militaries around the world, along with their defence-industry suppliers, will undoubtedly present many arguments in favour of remote-controlled weapons. They will claim that their systems will, of course, be impossible to hack. They will argue that even if their own country were to voluntarily forgo such systems, its potential enemies would not.
The best scenario would be to secure a comprehensive international treaty banning remote-controlled and autonomous weapons, and I believe this is a goal worth fighting for.
Sans titre (Boxer) - Jean-Michel Basquiat, 1982
However, even if this turns out not to be realistic, I would like the public discourse on AI safety to place greater emphasis on this issue and attempt to shift the Overton window. It would be highly valuable to put the world’s militaries on the defensive in this regard. At the very least, they should maintain only a strictly limited arsenal of remote-controlled weapons; establish protocols that require weapons to be armed manually, exclusively by humans, through unhackable command chains (or as close to unhackable as we can get); and ensure that the procurement of remote-controlled weapons undergoes far greater oversight than other weapon systems.
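To make the “manual arming only” idea a little more concrete, here is a toy sketch of what such a check could look like in software. Everything in it is hypothetical — the `may_arm` function, the key handling, and the command format are my illustrative assumptions, not an actual military protocol. The point is simply that a remote command alone should never be sufficient: arming should additionally require a credential that only a human physically present at the site can produce, for example a signature from a hardware key that never leaves the site.

```python
# Toy sketch (hypothetical, illustration only): arming requires a signature
# from a key that never leaves the physical site, standing in for
# "only a human at the site can arm the weapon". Uses the `cryptography`
# package (pip install cryptography).
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# In reality this key would live in a tamper-resistant hardware token held
# by on-site personnel; generating it here keeps the sketch self-contained.
onsite_key = Ed25519PrivateKey.generate()
ONSITE_PUBLIC_KEY = onsite_key.public_key()


def may_arm(command: bytes, onsite_signature: bytes) -> bool:
    """Allow arming only if an on-site human has signed this exact command."""
    try:
        ONSITE_PUBLIC_KEY.verify(onsite_signature, command)
        return True
    except InvalidSignature:
        return False


# A remote operator (or an AI that has compromised the network) can send
# commands, but cannot produce the on-site signature:
remote_command = b"ARM weapon-7"
assert not may_arm(remote_command, b"\x00" * 64)

# Only when the on-site key physically signs the command does arming proceed:
assert may_arm(remote_command, onsite_key.sign(remote_command))
```

A real system would of course need tamper-resistant hardware, two-person rules, and audit trails; the sketch only illustrates the separation between being able to communicate with a weapon and being able to arm it.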
The higher the destructive potential of a weapons system, the harder it must be for militaries to operate it remotely. Nuclear weapons are of course at the top of this list and must never be remote-controlled, but there are plenty of very deadly weapons in our arsenals today, and we should try very hard to firewall all of them against remote or autonomous control.
Addressing this issue now is likely to be much easier than tackling it in ten years’ time. Today, the existing arsenals of remote-controlled and autonomous systems are very limited, so militaries’ resistance will be weaker than it will be once large sums have been invested in stocking those weapons. Acting sooner rather than later could make a crucial difference in preventing the widespread adoption and expansion of these potentially dangerous systems.
Remote-controlled and autonomous weapons are not the only systems that can and should be strengthened today to mitigate adverse outcomes in a potential fast-take-off AI future. Making our financial systems and stock markets more robust and hardening important communications infrastructure are also very worthwhile. But if we were to pick one of these areas to start working on today, I think it should be strongly limiting remote weapons.
(Note: this post was also published on my Substack.)