From Turing to Watson (via Minsky)

This week (Monday 25th) I gave a lecture about IBM’s Watson technology platform to a group of first year students at Warwick Business School. My plan was to write up the transcript of that lecture, with links for references and further study, as a blog post. The following day when I opened up my computer to start writing the post I saw that, by a sad coincidence, Marvin Minsky the American cognitive scientist and co-founder of the Massachusetts Institute of Technology’s AI laboratory had died only the day before my lecture. Here is that blog post, now updated with some references to Minsky and his pioneering work on machine intelligence.

Marvin Minsky in a lab at MIT in 1968 (c) MIT

First though, let’s start with Alan Turing, sometimes referred to as “the founder of computer science”, who led the team that developed a programmable machine to break the Nazi’s Enigma code, which was used to encrypt messages sent between units on the battlefield during World War 2. The work of Turing and his team was recently brought to life in the film The Imitation Game starring Benedict Cumberbatch as Turing and Keira Knightley as Joan Clarke, the only female member of the code breaking team.

Alan Turing

Sadly, instead of being hailed a hero, Turing was persecuted for his homosexuality and committed suicide in 1954 having undergone a course of hormonal treatment to reduce his libido rather than serve a term in prison. It seems utterly barbaric and unforgivable that such an action could have been brought against someone who did so much to affect the outcome of WWII. It took nearly 60 years for his conviction to be overturned when on 24 December 2013, Queen Elizabeth II signed a pardon for Turing, with immediate effect.

In 1949 Turing became Deputy Director of the Computing Laboratory at Manchester University, working on software for one of the earliest computers. During this time he worked in the emerging field of artificial intelligence and proposed an experiment which became known as the Turing test having observed that: “a computer would deserve to be called intelligent if it could deceive a human into believing that it was human.”

The idea of the test was that a computer could be said to “think” if a human interrogator could not tell it apart, through conversation, from a human being.

Turing’s test was supposedly ‘passed’ in June 2014 when a computer called Eugene fooled several of its interrogators that it was a 13 year old boy. There has been much discussion since as to whether this was a valid run of the test and that the so called “supercomputer,” was nothing but a chatbot or a script made to mimic human conversation. In other words Eugene could in no way considered to be intelligent. Certainly not in the sense that Professor Marvin Minsky would have defined intelligence at any rate.

In the early 1970s Minsky, working with the computer scientist and educator Seymour Papert, wrote a book called The Society of Mind, which combined both of their insights from the fields of child psychology and artificial intelligence.

Minsky and Papert believed that there was no real difference between humans and machines. Humans, they maintained, are actually machines of a kind whose brains are made up of many semiautonomous but unintelligent “agents.” Their theory revolutionized thinking about how the brain works and how people learn.

Despite the more widespread accessibility to apparently intelligent machines with programs like Apple Siri Minsky maintained that there had been “very little growth in artificial intelligence” in the past decade, saying that current work had been “mostly attempting to improve systems that aren’t very good and haven’t improved much in two decades”.

Minsky also thought that large technology companies should not get involved the field of AI saying: “we have to get rid of the big companies and go back to giving support to individuals who have new ideas because attempting to commercialise existing things hasn’t worked very well,”

Whilst much of the early work researching AI certainly came out of organisations like Minsky’s AI lab at MIT it seems slightly disingenuous to believe that commercialistion of AI, as being carried out by companies like Google, Facebook and IBM, is not going to generate new ideas. The drive for commercialisation (and profit), just like war in Turing’s time, is after all one of the ways, at least in the capitalist world, that innovation is created.

Which brings me nicely to Watson.

IBM Watson is a technology platform that uses natural language processing and machine learning to reveal insights from large amounts of unstructured data. It is named after Thomas J. Watson, the first CEO of IBM, who led the company from 1914 – 1956.

Thomas J. Watson

IBM Watson was originally built to compete on the US television program Jeopardy.  On 14th February 2011 IBM entered Watson onto a special 3 day version of the program where the computer was pitted against two of the show’s all-time champions. Watson won by a significant margin. So what is the significance of a machine winning a game show and why is this a “game changing” event in more than the literal sense of the term?

Today we’re in the midst of an information revolution. Not only is the volume of data and information we’re producing dramatically outpacing our ability to make use of it but the sources and types of data that inform the work we do and the decisions we make are broader and more diverse than ever before. Although businesses are implementing more and more data driven projects using advanced analytics tools they’re still only reaching 12% of the data they have, leaving 88% of it to go to waste. That’s because this 88% of data is “invisible” to computers. It’s the type of data that is encoded in language and unstructured information, in the form of text, that is books, emails, journals, blogs, articles, tweets, as well as images, sound and video. If we are to avoid such a “data waste” we need better ways to make use of that data and generate “new knowledge” around it. We need, in other words, to be able to discover new connections, patterns, and insights in order to draw new conclusions and make decisions with more confidence and speed than ever before.

For several decades we’ve been digitizing the world; building networks to connect the world around us. Today those networks connect not just traditional structured data sources but also unstructured data from social networks and increasingly Internet of Things (IoT) data from sensors and other intelligent devices.

Data to Knowledge
From Data to Knowledge

These additional sources of data mean that we’ve reached an inflection point in which the sheer volume of information generated is so vast; we no longer have the ability to use it productively. The purpose of cognitive systems like IBM Watson is to process the vast amounts of information that is stored in both structured and unstructured formats to help turn it into useful knowledge.

There are three capabilities that differentiate cognitive systems from traditional programmed computing systems.

  • Understanding: Cognitive systems understand like humans do, whether that’s through natural language or the written word; vocal or visual.
  • Reasoning: They can not only understand information but also the underlying ideas and concepts. This reasoning ability can become more advanced over time. It’s the difference between the reasoning strategies we used as children to solve mathematical problems, and then the strategies we developed when we got into advanced math like geometry, algebra and calculus.
  • Learning: They never stop learning. As a technology, this means the system actually gets more valuable with time. They develop “expertise”. Think about what it means to be an expert- – it’s not about executing a mathematical model. We don’t consider our doctors to be experts in their fields because they answer every question correctly. We expect them to be able to reason and be transparent about their reasoning, and expose the rationale for why they came to a conclusion.

The idea of cognitive systems like IBM Watson is not to pit man against machine but rather to have both reasoning together. Humans and machines have unique characteristics and we should not be looking for one to supplant the other but for them to complement each other. Working together with systems like IBM Watson, we can achieve the kinds of outcomes that would never have been possible otherwise:

IBM is making the capabilities of Watson available as a set of cognitive building blocks delivered as APIs on its cloud-based, open platform Bluemix. This means you can build cognition into your digital applications, products, and operations, using any one or combination of a number of available APIs. Each API is capable of performing a different task, and in combination, they can be adapted to solve any number of business problems or create deeply engaging experiences.

So what Watson APIs are available? Currently there are around forty which you can find here together with documentation and demos. Four examples of the Watson APIs you will find at this link are:

Watson API - Dialog



Use natural language to automatically respond to user questions



Watson API - Visual Recognition


Visual Recognition

Analyses the contents of an image or video and classifies by category.



Watson API - Text to Speech


Text to Speech

Synthesize speech audio from an input of plain text.



Watson API - Personality Insights


Personality Insights

Understand someones personality from what they have written.



It’s never been easier to get started with AI by using these cognitive building blocks. I wonder what Turing would have made of this technology and how soon someone will be able to pin together current and future cognitive building blocks to really pass Turing’s famous test?