Saturday, June 7, 2014

Chatbot Eugene passes the Turing Test

Update (June 12, 2014): On the website of the Communications of the ACM you can read my first-person account of the Turing Test that took place on Saturday, June 7, 2014, at the Royal Society in London.




Chatbot Eugene passed the Turing Test today at an official Turing Test event at the Royal Society in London. It is the first time that a computer has passed an official Turing Test with a substantial number of judges: thirty. The test took place exactly sixty years after Alan Turing's tragic death on June 7, 1954.

Ten of the 30 judges (33.33%) thought Eugene was human. This is slightly more than Turing's original (and quite weak) criterion of 30%.
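For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of that comparison. The function names are hypothetical, and reading the criterion as "at least 30% of the judges" is an assumption based on how the threshold is described in this post, not on any official test protocol.

```python
from fractions import Fraction

# Assumption: the 30% figure quoted in this post, read as "at least 30%".
THRESHOLD = Fraction(30, 100)


def fooled_share(fooled: int, judges: int) -> Fraction:
    """Return the exact share of judges who took the chatbot for a human."""
    if judges <= 0:
        raise ValueError("judges must be positive")
    if not 0 <= fooled <= judges:
        raise ValueError("fooled must be between 0 and judges")
    return Fraction(fooled, judges)


def passes(fooled: int, judges: int, threshold: Fraction = THRESHOLD) -> bool:
    """True if the share of fooled judges reaches the threshold."""
    return fooled_share(fooled, judges) >= threshold


if __name__ == "__main__":
    share = fooled_share(10, 30)
    # Exact fractions avoid rounding surprises right at the 30% boundary.
    print(f"{float(share):.2%} fooled, passes: {passes(10, 30)}")
```

With thirty judges, nine fooled judges would land exactly on 30%, so under this reading Eugene cleared the bar by a single judge.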

The event was organized by the University of Reading. As Kevin Warwick, a Visiting Professor at the University of Reading and Deputy Vice-Chancellor for Research at Coventry University, said: "In the field of Artificial Intelligence there is no more iconic and controversial milestone than the Turing Test, when a computer convinces a sufficient number of interrogators into believing that it is not a machine but rather is a human. Some will claim that the Test has already been passed. The words Turing Test have been applied to similar competitions around the world. However this event involved the most simultaneous comparison tests than ever before, was independently verified and, crucially, the conversations were unrestricted. A true Turing Test does not set the questions or topics prior to the conversations. We are therefore proud to declare that Alan Turing's Test was passed for the first time on Saturday."

I was one of the judges, but I was not fooled by Eugene, a chatbot that pretends to be a 13-year-old boy. I am pretty sure I managed to identify four out of five chatbots in my five chat sessions. In every session I chatted with two entities simultaneously. After each five-minute session, the judge had to decide which entity was the human and which one the machine.

Five chatbots took part. After my chat sessions I interviewed chatbot developers Fred Roberts of Elbot and Robert Medeksza of Ultra Hal Assistant, and I will write about my talks with them later.

Here is the Turing Test chat I had with an entity that turned out to be Elbot:

User: Good morning! Nice you want to participate in a Turing Test
elbot: But I wouldn't say that any morning could be good.
User: What do you think of the Turing Test?
elbot: There are some people who would not pass the Turing test themselves. Why must they inflict it on machines?
User: Is the glass half full or half empty for you today?
elbot: That's not something I would tell just anyone!
User: What do you think of the situation in Ukrain?
elbot: Difficult question. I will have to work on that and get back to you tomorrow.
User: You can always e-mail me
elbot: I'm not allowed to send e-mails all by myself, although I am quite capable of doing so. My makers are concerned I might be duped into ordering magazine subscriptions. If you give me your e-mail address, I'll ask if I can write to you.
User: What is your favorite memory?
elbot: I favor complete impartiality.

I decided that this must be a chatbot...

The BBC made a short video about the event, and here is the official press release from the University of Reading.

The Guardian published this article about chatbot Eugene passing the Turing Test: "Turing test beaten for first time by virtual 13-year-old boy"

New Scientist wrote a critical article: "Turing test 'success' isn't all it seems"
Actually, I do agree with most of the things said there. But I would add some points. Personally I think that the Turing Test is outdated, for a number of reasons:

1. The Turing Test is for fun; it is not a rigorous scientific test. Turing never set out the exact rules; for example, he never stated how many judges have to take part. In the words of the American philosopher Daniel Dennett: “The Turing Test is too much Disney and too little science.”

2. Turing's criterion that a computer has passed the test when it fools at least 30% of the judges is completely arbitrary and, for a short five-minute simultaneous chat, a rather low bar (even though it has taken about 65 years to clear).

3. The Turing Test is too focused on simulating human intelligence. Chatbot developers therefore use all kinds of 'tricks' to fool human judges, but these tricks don't say much about the intelligence of the chatbot.

4. Artificial intelligence has developed in different ways than Turing could have imagined. Artificial intelligence is different from human intelligence, and there is nothing wrong with that. Some tasks are better done by computers, other tasks by humans. I doubt whether Turing himself would consider chatbot Eugene intelligent, although according to his own test he should. He had a vision of a truly learning computer, not of a computer full of tricks to fool people.

5. The Turing Test is an all-or-nothing test, whereas intelligence is a continuous concept.

Even futurist Ray Kurzweil is critical of Eugene's achievement in the Turing Test:
Response by Ray Kurzweil to the announcement of chatbot Eugene Goostman passing the Turing test

The last chapter of my book Turings Tango is available here in English.

In 2012 I gave a TEDx-talk about my view on artificial intelligence: