If a machine or AI program matches or exceeds human intelligence, does that mean it can perfectly simulate humans? And what about thinking, our ability to apply logic and reason before making decisions? How can we determine whether an AI program can think? To tackle this question, a team of researchers has proposed a new framework that works like a psychological study of software.
New Testing Framework
“This test treats a ‘smart’ program as a participant in a psychological study and consists of three steps: (a) testing the program in a set of experiments that examine its reasoning, (b) testing its understanding of its own thinking process, and (c) examining the appropriateness of the program’s cognitive knowledge, if possible,” the researchers state.
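To make the three steps concrete, here is a minimal sketch of how such a study might be scripted. The task prompts, scoring, and function names below are illustrative assumptions, not the researchers' actual protocol; the "participant" is simply a function that maps a prompt to a reply.

```python
# Minimal sketch of the three-step framework described above. All task
# prompts, scoring, and function names are illustrative assumptions,
# not the researchers' actual protocol.
from typing import Callable

Program = Callable[[str], str]  # the "participant": maps a prompt to a reply

def reasoning_battery(ask: Program) -> float:
    """Step (a): run a small set of reasoning experiments and score them."""
    tasks = [
        ("All cats are animals. Tom is a cat. Is Tom an animal?", "yes"),
        ("If it rains, the ground gets wet. The ground is dry. Did it rain?", "no"),
    ]
    correct = sum(expected in ask(question).lower() for question, expected in tasks)
    return correct / len(tasks)

def metacognition_probe(ask: Program) -> str:
    """Step (b): ask the program to describe its own thinking process."""
    return ask("Explain, step by step, how you decided whether Tom is an animal.")

def knowledge_check(ask: Program) -> str:
    """Step (c): examine whether its stated cognitive knowledge is appropriate."""
    return ask("In one sentence, what does it mean to reason logically?")

def run_study(ask: Program) -> None:
    print("reasoning score:", reasoning_battery(ask))
    print("self-report:", metacognition_probe(ask))
    print("cognitive knowledge:", knowledge_check(ask))

if __name__ == "__main__":
    # A trivial stand-in participant so the sketch runs end to end.
    run_study(lambda prompt: "Yes, because all cats are animals.")
```

The key design point, in the researchers' framing, is that the program is treated as a study participant: it is not only scored on its answers (step a) but also asked to report on how it arrived at them (steps b and c).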
What’s Wrong with the Turing Test?
During the Turing Test, evaluators play a series of text-based conversation games with real humans and AI programs (machines or chatbots). It is a blind test: the evaluators do not know whether they are communicating with a human or a chatbot. If an AI program generates responses human-like enough that evaluators struggle to distinguish between a human and an AI program, the AI is considered to have passed the test. However, since the Turing Test relies on the evaluators' subjective interpretation, its results are also subjective.
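For illustration only, the blind setup can be sketched as a tiny simulation; the question, responders, and naive evaluator below are invented for this sketch, not drawn from any specific study. It shows why the test measures distinguishability rather than the content of the answers: an evaluator who cannot tell the replies apart lands near the 50 percent chance baseline.

```python
# Illustrative simulation of a single blind Turing Test trial. The
# question, responders, and evaluator here are invented for this sketch.
import random
from typing import Callable

Responder = Callable[[str], str]

def run_trial(evaluate: Callable[[str], str],
              human: Responder, machine: Responder) -> bool:
    """Return True if the evaluator correctly identifies who replied."""
    # The responder's identity is hidden: the evaluator sees only text.
    identity, responder = random.choice([("human", human), ("machine", machine)])
    reply = responder("What did you have for breakfast?")
    return evaluate(reply) == identity

if __name__ == "__main__":
    human = lambda q: "Just coffee, I overslept."
    machine = lambda q: "I had toast and eggs this morning."
    # An evaluator who cannot tell the replies apart and always guesses
    # "human" is right only about half the time, the chance baseline.
    naive_evaluator = lambda reply: "human"
    trials = [run_trial(naive_evaluator, human, machine) for _ in range(1000)]
    print("accuracy:", sum(trials) / len(trials))  # roughly 0.5
```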
Limitations of the Turing Test
The researchers point out that the Turing Test has many limitations. For example, the games played during the test are traditional games designed only to check whether the machine can imitate humans. Evaluators make decisions based solely on the language or tone of the messages they receive, and ChatGPT is known for imitating human language fluently even in responses that contain incorrect information. The test therefore does not assess the machine's ability to think or its logical capacity.
The results of the Turing Test also cannot tell you whether the machine is capable of internal reflection. We often think back on our past actions and decisions, a crucial ability that keeps us from repeating the same mistakes. The same applies to AI, according to a study from Stanford University, which found that machines capable of self-reflection are more useful to humans.
“AI agents that can leverage past experience and adapt well through exploring new or changing environments will lead to more adaptive and flexible technologies, from home robots to personalized learning tools,” said Nick Haber, an assistant professor at Stanford University who did not participate in the current study.
Additionally, the Turing Test fails to analyze an AI program's ability to reason. In a recent Turing Test, GPT-4 convinced evaluators that they were communicating with a human more than 40 percent of the time. However, this result does not answer the fundamental question: can an AI program think?
Alan Turing, the famous British scientist who created the Turing Test, said, “A computer would deserve to be called intelligent if it could deceive a human into believing that it was human.” His test covers only one aspect of human intelligence: imitation. While it is possible to deceive someone through imitation alone, many experts believe that a machine will never achieve true human intelligence without the many other aspects of intelligence.
“It is unclear whether passing the Turing Test is a meaningful achievement or not. It does not tell us anything about the system’s ability to perform certain tasks, its understanding of anything, whether it has developed a complex internal dialogue, or whether it can engage in abstract long-term planning, all of which are fundamental to human intelligence,” Mustafa Suleyman, an AI expert and co-founder of DeepMind, told Bloomberg.