In his seminal 1950 paper, Alan Turing asked whether machines can think. He quickly pointed out that this is a poorly defined question; one would indeed be hard pressed to prove that humans can think. So he rephrased his question and proposed a test, commonly called the Turing test nowadays, to determine whether machines could do that which, if a human did it, we would call thinking.
I am going to present a slightly modified version of Turing’s test, then riff on it. My purpose is not to answer any particular question, but rather to see how many good questions on this topic I can ask.
First, the basic test (for the purposes of this blog, and with profound apologies to exact historical accuracy, I’ll be calling this “the standard Turing test”): begin with three participants named A, B, and C. We are given that A and C are human, while B is either human or a computer. Using a text-only interface, A’s job is to ask B questions. B’s job is to answer those questions (without Internet access, of course), and at the end of their exchange, C’s job is to determine (solely by reading the transcript) whether B is human or a computer. If B is actually a computer and C judges B to be human, B is said to have passed the Turing test.
Turing was a genius, but this is not a very good benchmark for AI programs. First and foremost, the test's validity depends heavily on the astuteness and intentions of both A and C, and is thus subject to human error. This brings me to my first two questions:
Q1: Suppose you are in charge of running the standard Turing test, that B is in fact a computer, and that an exceptionally bright human is available to play the role of either A or C (all other potential human participants are of average intelligence). If you want the computer to fail the test, should you assign your sharpest brain to the role of A, or C?
Q2: What if, in the same situation, you wanted the computer to pass? Obviously if A is in on the trick, then A should be the bright one. But what if A genuinely doesn’t know whether B is a computer?
In one way the Turing test is much too difficult: it requires an intelligent computer to think like a human, and also to deceive a human judge. I very much doubt whether any nonhuman life form, regardless of intelligence, would be able to pass this test; indeed, I’ve met several humans who would likely fail it.
In other ways the standard test is far too simple. For example:
Q3: In the standard Turing test, you are C. To A’s first question, B responds, “Sorry; I don’t want to talk right now,” and answers every subsequent question with silence. Is B human, or a computer?
It is easy, then, to write a program with at least a 50% chance of passing the Turing test: faced with a transcript that reveals nothing, C can only guess, and will guess “human” roughly half the time. One wonders whether ChatGPT (even granted Internet access) could do much better.
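To make the “easy program” claim concrete, here is a minimal sketch, in Python, of the strategy from Q3: one polite brush-off, then silence. The one-reply-per-question interface is my own invention for illustration; no particular test harness is implied.

```python
# A sketch of the "taciturn" program from Q3: one polite brush-off,
# then silence. The reply-per-question interface is hypothetical,
# invented purely for illustration.

class TaciturnBot:
    """Answers the first question with a refusal and every later one with silence."""

    def __init__(self) -> None:
        self.answered_once = False

    def reply(self, question: str) -> str:
        if not self.answered_once:
            self.answered_once = True
            return "Sorry; I don't want to talk right now."
        return ""  # silence from here on

if __name__ == "__main__":
    bot = TaciturnBot()
    for q in ["What did you have for breakfast?",
              "If you're human, what do you think it's like to be a computer?",
              "Prove, as inelegantly as possible, that ten is even."]:
        print(repr(bot.reply(q)))
```

A judge reading this transcript learns nothing about B either way, which is the whole trick.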
Q4: In the standard Turing test, you are A and you suspect that B might be ChatGPT. An extra rule has been added that B must answer each question, without being flippant or dismissive. What would your first three questions be?
Mine, for the record, are “Tell me, in a few paragraphs, about the first loved one you lost.” “If you’re human, tell me what you think it’s like to be a computer; if you’re a computer, tell me what you think it’s like to be human.” “Prove, as inelegantly as possible, that ten is even.”
You’ll notice I keep saying “the standard Turing test.” That’s because it’s time to modify it.
Turing Test Variation One: C remains a human judge, but now A and B take turns asking questions of each other. It is known that at least one of them is human; C must determine who, if anyone, is a computer.
I believe this is a much better way to determine whether computers can think like people. For a computer, asking good questions is far harder than answering them, whereas for humans it’s just the reverse.
Q5: In Variation One, you are A, you are aware that B is a computer, and your mission is to convince C. What sorts of questions and answers would you come up with?
This variation, of course, immediately suggests another.
Turing Test Variation Two: Exactly one of A and B is human, the other being a computer. C, the judge, is the computer actually being tested. As in Variation One, A and B take turns interviewing one another. Assuming the other computer is trying to pass as human, can C determine which is human?
Or, to put it another way, can an intelligent computer prevent another intelligent computer from passing Variation One? Of course, the human might well choose to play the saboteur.
Q6: In Variation Two, you are the sole human. How would you fool C?
And now we’re back where we started.