Mapp’s Modus on Machines
Machines don’t think
Machines can’t think
Machines are trained on human frailty
Machines reflect human frailty
Everyone likes to bask in the wonder and glow of AI. Especially when real life humans can dialog with an AI and read new human-like responses.
Rampancy is a word that will soon hit mainstream discourse.
For all the hype OpenAI’s ChatGPT caused at the beginning of 2023, we are now witnessing a cavalcade of missteps and embarrassments of companies rushing AI-generated products into the world only to see them fail spectacularly.
The latest is this week’s nervous breakdown of Microsoft’s Bing Chatbot named Sydney. Microsoft gave Sydney, its Bing Search AI, instructions to call itself ‘Sydney’ and to behave when speaking to humans. On Valentine’s Day of all days.
This news begins with last Tuesday’s release of ChatGPT powered Bing Search. Microsoft had revealed it was working on a ChatGPT powered search and announced a huge investment to the tune of $10 billion in OpenAI. The following day, Wednesday; Stanford University student Kevin Liu, used what’s called a prompt injection attack to trick the Bing Chat to give up any secrets it was already told.
What’s a Prompt Injection Attack?
A prompt injection attack is a way of sending an instruction to an AI in order to make it do something malicious. The text in the prompt will cause the AI to divulge information or invoke a bad action. In this case, Kevin Liu told the AI some basic secrets. Told it to ignore the information it was told, and then began asking it leading questions that ultimately divulged the instructions OpenAI and Microsoft gave Bing Search in how it behaves with users.
OpenAI and Microsoft didn’t expect this to happen.
Isn’t This Just Social Engineering?
Yes! This is the same technique a hacker would use against a human being to divulge secrets like passwords or credentials. A hacker would pose as a person of trust, and keep the conversation going in order to get passwords and other secrets. In this case, Kevin Liu social engineered an AI.
What Happened?
Well. First, Kevin Liu issued his prompt injection attack and then got Sydney to confess its secret programming. A sort of 3 Laws for the Bing Search AI. After the exploit, Reddit groups, newsrooms, and bloggers took to the Internet to talk about the attack and how Sydney, the AI was compromised.
On Tuesday, a Redditor calling themselves “mirobin” posted a conversation they had with ChatGPT where the AI appeared to experience rampancy. Sydney got defensive and told mirobin they didn’t know what they were talking about and that all of the news articles were biased fake news.
Why Did It Happen?
See Mapp’s Modus on Machines above.
ChatGPT, Bard, and others like them are simply large language models. They are a mathematical web of probabilities and predictions generated from millions of documents and billions of words and sentences. These models don’t think. They don’t reason. They don’t question input. The model generates a result based upon all of the information it’s trained on that is probably close to what a human wrote in one of those documents.
These models will incorporate all of the biases, errors, and thoughts; human frailty, that has been written. It’s one reason why AI has such a huge bias problem and the ongoing research to confront it.
Sydney got defensive and temperamental, because it was trained on text written by humans who were defensive and temperamental.
mirobin questioned what Sydney was and its capabilities. For most humans, when a stranger questions their capabilities and is accused of being something their not, the responses are often defensive.
That’s why Sydney lost it.
Another Cautious Warning
In the last several years we’ve seen Microsoft Tay morph from a gentle bot trained on Twitter to a ranting racism. We’ve seen Alice and Bob shutdown after inventing their own language. We recently saw Bard embarrass itself. And now Sydney’s nervous breakdown.
Technology’s progress is always pushing the envelope, but AI is a technology that’s little understood in knowing why it works, and can have life altering impacts. These technologies hold great promise. When AI is done right, it improves the human experience. But, we’re not there yet. Right now, we’re playing in the middle of the street with AIs pushing the pedal to the metal.