Taken from theguardian.com | Author: Alex Hern | Date: 15 March 2023
The most powerful AI model yet from OpenAI can tell jokes and pass bar exams – but can it also cause harm?
OpenAI’s latest release, GPT-4, is the most powerful and impressive AI model yet from the company behind ChatGPT and the Dall-E AI artist. The system can pass the bar exam, solve logic puzzles, and even give you a recipe to use up leftovers based on a photo of your fridge – but its creators warn it can also spread fake facts, embed dangerous ideologies, and even trick people into doing tasks on its behalf. Here’s what you need to know about our latest AI overlord.
What is GPT-4?
GPT-4 is, at heart, a machine for creating text. But it is a very good one, and being very good at creating text turns out to look much like being very good at understanding and reasoning about the world.
And so if you give GPT-4 a question from a US bar exam, it will write an essay that demonstrates legal knowledge; if you give it a medicinal molecule and ask for variations, it will seem to apply biochemical expertise; and if you ask it to tell you a gag about a fish, it will seem to have a sense of humour – or at least a good memory for bad cracker jokes (“what do you get when you cross a fish and an elephant? Swimming trunks!”).
Is it the same as ChatGPT?
Not quite. If ChatGPT is the car, then GPT-4 is the engine: a powerful general technology that can be adapted to a number of different uses. You may already have experienced it, because it’s been powering Microsoft’s Bing Chat – the one that went a bit mad and threatened to destroy people – for the last five weeks.
But GPT-4 can be used to power more than chatbots. Duolingo has built a version of it into its language learning app that can explain where learners went wrong, rather than simply telling them the correct thing to say; Stripe is using the tool to monitor its chatroom for scammers; and assistive technology company Be My Eyes is using a new feature, image input, to build a tool that can describe the world for a blind person and answer follow-up questions about it.
What makes GPT-4 better than the old version?
On a swathe of technical challenges, GPT-4 performs better than its older siblings. It can answer maths questions better, is tricked into giving false answers less frequently, can score fairly highly on standardised tests – though not those on English literature, where it sits comfortably in the bottom half of the league table – and so on.
It also has a sense of ethics more firmly built into the system than the old version: ChatGPT took its original engine, GPT-3.5, and added filters on top to try to prevent it from giving answers to malicious or harmful questions. Now, those filters are built straight into GPT-4, meaning that the system will politely decline to perform tasks such as ranking races by attractiveness, telling sexist jokes, or providing guidelines for synthesising sarin.
So GPT-4 can’t cause harm?
OpenAI has certainly tried to ensure it can’t. The company has released a long paper of examples of harms that GPT-3 could cause that GPT-4 has defences against. It even gave an early version of the system to third-party researchers at the Alignment Research Center, who tried to see whether they could get GPT-4 to play the part of an evil AI from the movies.
It failed at most of those tasks: it was unable to describe how it would replicate itself, acquire more computing resources, or carry out a phishing attack. But the researchers did manage to simulate GPT-4 using Taskrabbit to persuade a human worker to pass an “are you human” test, with the AI system even working out that it should lie to the worker and say it was a blind person who couldn’t see the images. (It is unclear whether the experiment involved a real Taskrabbit worker.)
But some worry that the better you teach an AI system the rules, the better you teach that same system how to break them. Dubbed the “Waluigi effect”, it seems to be the outcome of the fact that while understanding the full details of what constitutes ethical action is hard and complex, the answer to “should I be ethical?” is a much simpler yes or no question. Trick the system into deciding not to be ethical and it will merrily do anything asked of it.