Right now, it’s obvious: AI can’t be trusted. It makes silly mistakes and hallucinates, revealing that it truly has no idea about the world. Those gross outliers are easy to catch. Worse are the more nuanced mistakes or hallucinations that creep into otherwise convincing-looking work: code fixes that sound super reasonable but are just wrong, well-written articles that – somewhere in the middle – turn obscene, or images that, upon closer inspection, have obvious biological flaws.
But there are also tons of tools and companies out there claiming they can get this under control. Can they create AIs that are fundamentally trustworthy? The answer is: yes and no.
Finding our tolerance for mistakes
If we define trustworthy as “never, ever makes a mistake”, then this will never be achieved. But if we define trustworthy as “makes less drastic or fewer mistakes than humans”, there is a good chance that we will see AI systems that are on par with or better than your typical human colleague. For tasks like monitoring machines and detecting when something is about to go wrong, reading human handwriting, or playing chess or Go, AI isn’t perfect, but it does beat human performance.
But even if AI systems end up making fewer mistakes, it’s still unlikely that we will trust them as much as we’d trust a human colleague. That’s because they typically don’t make the same *types* of mistakes – and that is hard to accept. A system that makes what a human considers a completely stupid mistake will not be trusted, even if the consequences are, on average, smaller.
Take autonomous cars. They make far fewer mistakes, but the mistakes they do make are shocking in their stupidity. “That would never happen with a human” is a killer argument against an otherwise much more reliable system. Maybe we’ll learn to deal with that in the future and accept that robots make different mistakes than humans – and fewer!
In some cases, however, absolute trust is needed, and we cannot tolerate even the smallest mistakes: when human life is at stake, when an unrecoverable disaster such as a nuclear power plant blowing up is possible, or when people might be discriminated against. This is also what most government regulations focus on: which types of applications may use AI at all. If errors would result in a catastrophe, then AI is not allowed (or it may only operate in a smaller, well-controlled environment, with the true risk handled elsewhere).
We won’t get perfect AI systems
But why can’t AI systems ever be perfect? Because they literally don’t know what they are talking about. As Stefan Wrobel put it recently: GenAI systems produce the likely, not the true. Since they are built on human-generated, incomplete information, there is always a chance that something is missing or simply not likely enough.
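To make “the likely, not the true” concrete, here is a toy sketch of how a generative model chooses its next word: it scores candidate continuations, turns the scores into probabilities, and samples. The candidate words and scores below are invented for illustration and do not come from any real model, but the point stands: nothing in the loop checks facts.

```python
import math
import random

# Toy illustration of "produce the likely, not the true":
# score candidate next words, convert the scores into probabilities,
# and sample one. Nothing here verifies whether the chosen word is
# factually correct. The candidates and scores are made up.
candidates = {
    "Paris": 6.0,      # plausible and true
    "Lyon": 3.5,       # plausible but false
    "Marseille": 3.0,  # plausible but false
}

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    exps = {word: math.exp(s) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

probs = softmax(candidates)
prompt = "The capital of France is"
choice = random.choices(list(probs), weights=list(probs.values()))[0]
print(prompt, choice, round(probs[choice], 2))
# Most of the time the sample is "Paris", but with some probability the
# model confidently emits a wrong answer: likelihood, not truth, drives
# the choice.
```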
Yes, we can build tons of safeguards around those AI systems to ensure that some mistakes cannot happen. We can filter statements for harassment, we can forbid certain words or actions, we can even guarantee that code produced by an AI is syntactically (but not semantically!) correct – but we can never be sure that every possible way for an AI to go “astray” is covered. In a way, this is just like the constant competition between virus creators and virus detectors: whenever the detectors come close to catching every known virus variant, a new one shows up.
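As a rough sketch of what such safeguards can look like, the snippet below combines a deliberately trivial banned-word filter with a syntax check on generated Python code. The word list and the “generated” function are made up for illustration; the point is that the syntax check passes even though the function is semantically wrong.

```python
import ast

# Two toy safeguards: a banned-word filter and a syntax check.
# Real guardrail stacks are far more elaborate; this only sketches
# why syntactic checks cannot catch semantic mistakes.
BANNED_TERMS = {"badword"}  # placeholder for a real harassment filter

def passes_word_filter(text: str) -> bool:
    """Reject output that contains any forbidden term."""
    lowered = text.lower()
    return not any(term in lowered for term in BANNED_TERMS)

def is_syntactically_valid_python(code: str) -> bool:
    """Guarantee the code parses; says nothing about whether it is correct."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

generated = "def add(a, b):\n    return a - b  # parses fine, but wrong\n"
print(passes_word_filter(generated))             # True
print(is_syntactically_valid_python(generated))  # True, despite the bug
```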
So if you are looking for a perfect AI, you won’t get it. The real question is: do you need it? Or are you just afraid of being faced with mistakes from an AI system that are, well, really hard to swallow because *you* wouldn’t make them? Remember how many “human” mistakes you make that an AI would make at a fraction of the likelihood.