Artificial intelligence (AI) is not a new technology. For decades, computer scientists have tried different approaches to reach the holy grail of computing: intelligent machines. While we are still far away from replicating the wonders of the human brain, AI applications have started to fill our daily lives and power our electronic devices, from smartphones to home alarm systems. Why this seemingly sudden explosion? This post will answer this question by teaching you about modern AI—including the core principles behind it, and how and why we got to where we are now.
As humans, we’ve always tried to find ways to understand the world around us and bend nature to meet our goals. To do so, we have always relied on external tools that amplify our brain’s capabilities.
The abacus was probably the first such tool, invented about 5,000 to 6,000 years ago to help people make calculations. Although it’s still used in schools to help children visualize simple mathematical operations, it doesn’t really save us from the labor of actually performing them. We had to wait until the 1960s for the first machines that could add and subtract numbers automatically. Computers have come a long way since then, but deep down their capabilities are still pretty simple: executing calculations exactly as some (expert) human has instructed them to. There’s little “intelligence” in them.
The two words artificial and intelligence were first put together on August 31, 1955, when professor John McCarthy from Dartmouth College, together with M. L. Minsky from Harvard University, N. Rochester from IBM, and C. E. Shannon from Bell Telephone Laboratories, asked the Rockefeller Foundation to fund a summer of research on artificial intelligence. Their proposal stated the following:
> We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. . . . An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.
The researchers knew that tackling intelligence as a whole was too tough a challenge, both because of technical limitations and the inherent complexity of the task. Instead of attacking the broad concept of intelligence, they decided to focus on subproblems, like language. Later, these applications would be called narrow AI. An artificial intelligence capable of matching or surpassing human capabilities would instead be called general AI. In other words:
- General AI (or strong AI)—An artificial intelligence program capable of tackling every kind of task it’s presented. This is similar to an extremely resourceful human, and you can think of it as the robot from The Terminator (or, hopefully, a more peaceful version of it).
- Narrow AI—An artificial intelligence program capable of solving a single, well-defined task. The task can be broad (recognizing objects from pictures) or extremely specific (predicting which customers who bought product A are more likely to purchase product B as well). Either way, it handles one task and only that task: an AI that recognizes cats in images can’t translate English to Italian, and vice versa.
General AI is still far away: researchers don’t know when we’ll finally get it, and some argue that we never will. Even though general AI is still a distant, fuzzy dream, it’s what many people have in mind when AI is mentioned in the news. If you were one of those people, and are now disappointed that general AI is not here yet, don’t despair. Narrow AI applications are still capable of creating immense value. For example, AI that can detect lung cancer is a narrow application but nevertheless extremely useful.
The results of the Dartmouth research summer of 1956 were so interesting that they sparked a wave of excitement and hope among the participants. The enthusiasm of the scientists spread to the US government, which started heavily funding research on a specific application: English/Russian translation. Finding trustworthy Russian translators must not have been easy in the midst of the Cold War.
After the first few years of work, a government committee produced the infamous 1966 Automatic Language Processing Advisory Committee (ALPAC) report. The document featured the opinions of many researchers about the state of AI research. Most were not very positive. The ALPAC report marks the beginning of a period called the first AI winter: public funding for AI research stopped, excitement cooled, and researchers focused their work on other fields.
Interest in AI faded until the 1980s, when private companies such as IBM and Xerox started investing in a new AI spring. New hopes were fueled by a technology called expert systems: computer programs that encode the knowledge of a human expert in a certain field in the form of precise, if-then rules. An example will help you understand how expert systems were designed to work.
Suppose you want to build an AI system that can stand in for a gastroenterologist. This is how you do it with an expert system: you ask a doctor to describe with extreme precision how they make decisions about patients. You then ask a programmer to painstakingly transform the doctor’s knowledge and diagnosis flow to if-then rules that can be understood and executed by a computer. An extremely simplified version would look something like this:
If the patient has a stomachache and the body temperature is high, then the patient has the flu.
If the patient has a stomachache and has eaten expired food, then the patient has food poisoning.
And so on. Once the doctor’s knowledge is encoded into the software and a patient comes in, the software follows the same decision path as the doctor and (hopefully) comes up with the same diagnosis. This approach has several problems:
- Poor adaptability—The only way for the software to improve is to go back to the drawing board with a computer scientist and the expert (in this case, the doctor).
- Extreme brittleness—The system will fail in situations that weren’t part of the original design. What if a patient has a stomachache but normal body temperature, and hasn’t eaten spoiled food?
- Tough to maintain—The complexity of such a system is huge. When thousands of rules are put together, improving it or changing it is incredibly complicated, slow, and expensive. Have you ever worked with a huge Microsoft Excel sheet and struggled to find the root cause of a mistake? Imagine an Excel sheet 100 times that size.
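To make these problems concrete, here is a minimal sketch of the two hypothetical diagnosis rules above written as code (the rule names and symptom flags are invented for illustration; real expert systems used dedicated rule engines, not plain if-statements):

```python
def diagnose(symptoms):
    """Toy expert system: the doctor's knowledge hand-coded as if-then rules."""
    if symptoms.get("stomachache") and symptoms.get("high_temperature"):
        return "flu"
    if symptoms.get("stomachache") and symptoms.get("ate_expired_food"):
        return "food poisoning"
    # Brittleness in action: any patient not covered by a rule falls through
    # with no diagnosis at all.
    return "unknown"

print(diagnose({"stomachache": True, "high_temperature": True}))  # flu
print(diagnose({"stomachache": True}))  # unknown
```

Note how the third problem shows up immediately: a patient with a stomachache but a normal temperature who hasn’t eaten spoiled food gets no answer, and the only fix is for a programmer to go back and write more rules.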
Expert systems were a commercial failure. By the end of the 1980s, many of the companies that were developing them went out of business, marking the beginning of the second AI winter. It wasn’t until the early 2000s that the next generation of AI successes came along, fueled by an old idea that became new again: machine learning.
Read more about machine learning in the next post.