Turing test

The Turing test, originally called the imitation game by Alan Turing in 1949, is a way to see whether a machine can behave as intelligently as a human. In this test, an evaluator exchanges text messages with both a human and a machine and must decide which is which. If the evaluator cannot reliably tell the two apart, the machine is said to have passed the Turing test.

Figure: The "standard interpretation" of the Turing test. Player C, the interrogator, must determine which of players A and B is a computer and which is a human, using only their responses to written questions.

Turing created this idea in his famous 1950 paper, "Computing Machinery and Intelligence," while he worked at the University of Manchester. He started by asking, "Can machines think?" Since it's hard to define what "thinking" really means, he suggested a different question that is easier to understand. He used a party game where someone has to guess whether they are talking to a man or a woman, and then asked if a computer could do well in a similar game.

Since Turing introduced his test, it has had a big impact on the philosophy of artificial intelligence. Many people have talked about and debated the idea. Some philosophers, like John Searle, have argued that the test cannot really show if a machine is conscious. Even with these debates, the Turing test remains an important idea in understanding how we can tell if a machine is acting like a thinking being.

History

Alan Turing in 1951

The question of whether machines can think goes back a long way and is tied to debates about whether the mind is physical or not. Early thinkers like René Descartes considered whether machines could ever truly think like humans. Later, others like Denis Diderot suggested that if a machine could answer any question, it might be considered intelligent.

Alan Turing introduced what we now call the Turing test in 1950. He suggested a game where a person tries to tell whether they are talking to a machine or a human through text messages. If the person cannot reliably tell the difference, the machine has passed the test. This idea sparked many discussions about whether computers could ever truly think like humans. Over the years, people have debated the test’s meaning and value, with some arguing it doesn’t prove a machine is thinking, only that it can mimic conversation well. The test remains a famous way to explore the possibilities and limits of artificial intelligence.
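One way to make "cannot reliably tell the difference" concrete is to ask whether the interrogators' success rate can be distinguished from coin-flipping. The sketch below is only an illustration under that assumption; Turing's paper does not prescribe a statistical procedure, and the function name and the 50% baseline are hypothetical choices made here.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than
    the observed one, assuming each guess is correct with probability p.
    """
    pmf = [
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small tolerance so that ties are not lost to floating-point rounding.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical numbers: interrogators were right in 60 of 100 rounds.
p_value = binomial_two_sided_p(60, 100)
```

A large p-value would mean the interrogators' record is consistent with guessing, which is one possible reading of a machine "passing"; a small one would mean they can tell the players apart better (or worse) than chance.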

Attempts

Some early computer programs were said to pass the Turing test by using simple tricks, like pretending to have mental illness or not understanding English very well.

In 1966, a program named ELIZA was created to mimic a therapist. It worked by picking out keywords in the user's messages and echoing them back in canned replies, which led some people to believe they were talking to a real person. Eugene Goostman, a program first developed in 2001, pretended to be a 13-year-old Ukrainian boy who was still learning English; at a 2014 event, 33% of judges thought it was human.
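The keyword trick can be sketched in a few lines. The rules, pronoun table, and function names below are hypothetical and written for illustration only; they are not taken from ELIZA's original script.

```python
import re

# Hypothetical keyword rules: a pattern to look for and a reply template.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
]

# Swap first-person words for second-person ones so the echo reads naturally.
PRONOUNS = {"i": "you", "me": "you", "my": "your", "am": "are"}

DEFAULT_REPLY = "Please go on."


def reflect(fragment):
    """Strip trailing punctuation and flip pronouns in the captured text."""
    words = fragment.rstrip(".!?").split()
    return " ".join(PRONOUNS.get(word.lower(), word) for word in words)


def respond(message):
    """Return the reply for the first rule whose keyword appears."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            return template.format(reflect(match.group(1)))
    # No keyword found: fall back to a neutral prompt.
    return DEFAULT_REPLY


print(respond("I feel sad about my job."))  # Why do you feel sad about your job?
```

Nothing here models what the user means; the program only rearranges the user's own words, which is why such replies can seem attentive without any understanding behind them.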

More recently, advanced programs like ChatGPT have come close to passing the Turing test. In March 2024, researchers reported that ChatGPT passed a version of the test, differing from average human behavior mainly by being more cooperative. In a study in March 2025, another program was judged to be human 73% of the time.

Versions

Alan Turing described a game with three players to explore whether machines could think like humans. In the game, one player tries to trick another into making the wrong decision about who is a man and who is a woman, by only sending written notes. Turing wondered what would happen if a machine took the place of the person trying to trick the interrogator.

Later, Turing suggested a version where a computer plays the tricking role, and a human helps the interrogator. Some people think the real goal of the Turing test is to see if a computer can just act like a human, not necessarily trick someone completely. This idea is called the "standard interpretation," though not everyone agrees it was Turing's original plan. In this version, the interrogator tries to tell whether they are talking to a human or a machine.
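The standard interpretation can be sketched as a small protocol. Everything below is an illustrative assumption: the function names, the idea of representing each player as a function from question to answer, and the toy players are not part of Turing's description.

```python
import random


def run_trial(interrogator, human, machine, questions, rng):
    """Run one round; return True if the interrogator picks out the machine."""
    # Randomly hide the two respondents behind the labels "A" and "B".
    players = [human, machine]
    rng.shuffle(players)
    labelled = {"A": players[0], "B": players[1]}

    # The interrogator sees only the written answers, never the players.
    transcript = {
        label: [(q, player(q)) for q in questions]
        for label, player in labelled.items()
    }
    guess = interrogator(transcript)  # expected to return "A" or "B"
    return labelled[guess] is machine


def detection_rate(interrogator, human, machine, questions, trials=1000, seed=0):
    """Fraction of rounds in which the machine is correctly identified."""
    rng = random.Random(seed)
    hits = sum(
        run_trial(interrogator, human, machine, questions, rng)
        for _ in range(trials)
    )
    return hits / trials


# Toy stand-ins, purely illustrative.
def toy_human(question):
    return "Hmm, let me think about that."


def toy_machine(question):
    return "QUERY NOT UNDERSTOOD."


def keyword_interrogator(transcript):
    """Accuse whichever player answered in all capitals; default to "A"."""
    for label, exchanges in transcript.items():
        if any(answer.isupper() for _, answer in exchanges):
            return label
    return "A"
```

With these toy players, detection_rate(keyword_interrogator, toy_human, toy_machine, ["What is your favourite poem?"]) returns 1.0, because the machine gives itself away every time. A machine whose answers were indistinguishable from the human's would push the rate toward 0.5, which is the situation the test is looking for.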

Interpretations

People have debated what Alan Turing really meant when he described his test. Some believe Turing had two different versions in mind. One version is like a party game where a judge tries to tell the difference between a person and a computer by looking at conversation transcripts. The other is where a judge chats with both a person and a computer to see which is which.

Others think Turing was mainly interested in whether a machine could think like a person. They suggest his test was about how well a machine could trick a person into believing it was human. Some also argue that Turing's test is less about copying humans and more about showing cleverness and problem-solving skills.

Strengths

The Turing test is popular because it is simple. Fields such as the philosophy of mind, psychology, and neuroscience have struggled to define intelligence clearly enough to apply the idea to machines. The Turing test, even if imperfect, offers something that can actually be measured. It also allows the person testing to ask almost any kind of question, making it useful for many areas of study.

To pass the Turing test, a machine must be able to use natural language, reason, have knowledge, and learn. The test can even be extended to include tasks that need vision or robotics skills. This makes it a good way to explore many challenges in artificial intelligence research. The test also highlights the importance of emotional and aesthetic intelligence, suggesting that understanding feelings and beauty may be key to creating safe and friendly AI systems.

Weaknesses

Alan Turing, who created the Turing test, did not say it could measure "intelligence." However, some people have suggested using it that way. This idea has faced criticism. Some worry that the test only shows how easy it is to trick humans, not if a machine is truly intelligent.

The Turing test faces several objections. It tests whether a machine can behave like a human, not whether it is truly intelligent. For example, a machine might need to copy human mistakes, such as typing errors or slow arithmetic, to pass. Also, the test does not show whether a machine could solve hard problems better than humans. Because of these issues, many experts think the Turing test is not the best way to study artificial intelligence. They prefer testing specific skills, like recognizing objects, instead of trying to copy humans exactly.

Variations

Many different versions of the Turing test have been suggested over the years.

One version is called the reverse Turing test, where the computer tries to tell if it is talking to a human or another computer. CAPTCHA tests, where you must read distorted words to prove you are not a robot, are an example of this reverse test.

Another version, called the subject-matter expert Turing test, asks if a machine can talk like an expert in a certain area, such as medicine or science. There are also tests that try to see if a machine can understand language deeply, not just copy words, and tests that include seeing and moving objects, like a robot might need to do.

In 2023, a company called AI21 Labs created a big online game called "Human or Not?" that was played millions of times. The results showed that about 32% of people could not tell if they were talking to a human or a computer.

Alternative tests for machine intelligence

The Lovelace test is named after Ada Lovelace, who believed computers should be credited with intelligence only when they can create original ideas.

In 2023, David Eagleman suggested that truly intelligent systems should be able to make scientific discoveries. He described two levels: Level 1, where the AI connects existing facts, and Level 2, where the AI develops entirely new ideas and tests them.

Other tests for AI intelligence include the Winograd Schema Challenge, which checks understanding of language; the Allen AI Science Challenge, which involves answering science questions; and the Artificial General Intelligence (AGI) Test, which sees if a machine can do any intellectual task a human can.

Conferences

In 1990, to mark the 40th anniversary of Alan Turing's famous paper, the Turing Colloquium was held at the University of Sussex. Scholars and experts from many fields gathered to talk about the Turing test and what it might mean in the future. That same year also saw the start of the Loebner Prize competition.

Later, in 2008, a special meeting was organized by the AISB at the University of Reading alongside the Loebner Prize. Famous thinkers talked about the Turing test, but they did not all agree on exactly what the test should be.