On August 23rd, ExtraHop hosted a champagne tasting for an exclusive group of about 50 cybersecurity executives. The virtual event featured special guest Jonathan Zittrain, co-founder and faculty director of Harvard’s Berkman Klein Center for Internet & Society. Zittrain, a world-renowned expert on digital technology and policy, gave a talk on the state of artificial intelligence (AI) that attendees said succeeded in demystifying much of this vexing technology.
“This has been the best presentation on AI that I’ve seen,” one attendee wrote in the chat.
“Wonderful event. Thanks so much for sharing these insights. Very valuable,” wrote another.
Read on for a recap of Zittrain’s presentation.
1966: The Birth of Generative AI
One of the longest-standing problems in AI research has been figuring out how to have a conversation with a computer that feels human and far-ranging. Zittrain traced the history of this problem and its influence on AI chatbots and generative AI back to 1966, when Joseph Weizenbaum, a computer scientist and professor at MIT, created ELIZA, “a program which makes natural language conversation with a computer possible.” Weizenbaum named the program after Eliza Doolittle, a character from George Bernard Shaw’s play “Pygmalion” (and its 1964 screen adaptation, “My Fair Lady”) to emphasize that its language abilities could be improved with the help of a teacher.
Zittrain characterized ELIZA as a rudimentary version of what’s known as an expert system. In this form of AI, end users interact with a knowledge base curated from experts. ELIZA was meant to act as a psychotherapist, a helpful conceit that hid the fact that its conversational abilities were limited. Weizenbaum admitted that ELIZA’s conversation skills were little more than a parlor trick: it would simply mirror what a user said to it, so “I’m” becomes “you’re” in a reply, and vice versa, in an imitation of the way a therapist might speak. Despite its limitations, users were more than happy to go along, which worried Weizenbaum. He issued a prescient warning that’s as appropriate for today’s chatbots as it was for ELIZA in the mid-1960s: “ELIZA shows, if nothing else, how easy it is to create and maintain the illusion of understanding, hence perhaps judgment deserving of credibility. A certain danger lurks there.”
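To make the “parlor trick” concrete, here’s a rough sketch of the mirroring idea in Python. This is our illustration, not Weizenbaum’s actual pattern-matching script:

```python
import re

# Toy ELIZA-style reflection (illustrative only): swap first- and
# second-person pronouns, then echo the statement back as a question.
REFLECTIONS = {
    "i'm": "you're",
    "you're": "i'm",
    "i": "you",
    "you": "i",
    "my": "your",
    "your": "my",
    "me": "you",
}

def reflect(statement: str) -> str:
    words = re.findall(r"[\w']+", statement.lower())
    mirrored = [REFLECTIONS.get(word, word) for word in words]
    return "Why do you say " + " ".join(mirrored) + "?"

print(reflect("I'm worried about my job"))
# -> Why do you say you're worried about your job?
```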
How Does Machine Learning Work?
Machine learning (ML) has its roots in neural networks, the building blocks of which have been in place since the 1940s. ML and neural networks have stepped into the limelight over the last decade as improvements in computing power and training data have made them dramatically more capable.
Zittrain explained there are at least two methods by which neural networks learn: unsupervised and supervised learning. Figuratively speaking, he compared unsupervised learning to pouring a data set into a bucket, gently shaking it around, and seeing if anything useful appears. Oftentimes, Zittrain noted, nothing of value appears, but sometimes the machine groups the data in a way that makes intuitive sense to humans. The machine doesn’t know anything about the labels and reality behind the data; it’s simply offering a possible grouping. It’s up to us to make inferences and find meaning in the grouping. In a way, Zittrain added, unsupervised learning is simply applied statistics or vectorized math. This method can produce surprisingly powerful results, like predicting songs you’d like based on previous listening habits.
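For a feel of what that bucket-shaking looks like in practice, here’s a tiny, hedged sketch using scikit-learn. The data and the two-cluster setup are invented for illustration; Zittrain’s talk didn’t include code:

```python
import numpy as np
from sklearn.cluster import KMeans  # assumes scikit-learn is installed

# Stand-in for "pouring a data set into a bucket and shaking it":
# unlabeled 2D points that happen to come from two loose clumps.
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[3, 3], scale=0.5, size=(50, 2)),
])

# The algorithm knows nothing about what the points "mean"; it just
# proposes a grouping. Interpreting that grouping is up to us.
groups = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(data)
print(groups[:5], groups[-5:])  # e.g. [1 1 1 1 1] [0 0 0 0 0]
```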
In contrast to unsupervised learning, supervised learning starts with labeled data sets, known as “ground truth.” A machine designed to identify frogs would be trained on at least two data sets: one with images labeled “frogs” and another with images of other creatures labeled “not frogs.” Given this ground truth, the model should ideally be able to classify images from outside the training data as frogs or not frogs.
Zittrain explained that the architecture of these kinds of AI is based on a rudimentary conception of the connections between neurons, which is why they’re called neural networks. In a “frog-not-frog” machine, each pixel in an image is fed to a “neuron” in the input layer. Depending on the signals it receives, each node in the next layer either fires or doesn’t. This cascades through more layers until the output layer is reached, where the outputs are combined into a number between 0 and 1. A one means the image is of a frog. A zero means it isn’t a frog. Something in between indicates something frog-like.
These supervised models start out untrained, so the first results from the frog-not-frog machine will be random. But through various training techniques, said Zittrain, we can start to get more accurate answers by telling the model whether it was right or not.
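For readers who like to see the machinery, here’s a minimal sketch of a frog-not-frog classifier in PyTorch. The “images” and labels are random placeholders and the network is far smaller than anything used in practice; it’s meant only to show the pixel-to-score flow and the training loop described above:

```python
import torch
from torch import nn

# Sketch of a "frog-not-frog" classifier (illustrative only, fake data):
# every pixel feeds an input "neuron," values cascade through layers, and
# the output is squashed to a number between 0 (not frog) and 1 (frog).
model = nn.Sequential(
    nn.Linear(32 * 32, 64),  # one input per pixel of a 32x32 grayscale image
    nn.ReLU(),
    nn.Linear(64, 1),
    nn.Sigmoid(),            # final score between 0 and 1
)

# Stand-in "ground truth": random tensors in place of real labeled photos.
images = torch.rand(200, 32 * 32)
labels = torch.randint(0, 2, (200, 1)).float()  # 1 = frog, 0 = not frog

# The untrained model answers randomly; training nudges its parameters
# toward the labels by penalizing wrong answers.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()
for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()

print(model(torch.rand(1, 32 * 32)).item())  # "frog-ness" score for a new image
```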
Strangely, Zittrain noted, if you look closely at the successfully trained model, you won’t easily find a parameter for “buggy eyes” or “green.” This is known as the problem of interpretability. It’s difficult for a human to look inside a model and see how it “reasons” whether something is a frog or not. Essentially, we know it works, but it’s not clear exactly how.
Surprising Successes and Shortcomings of Large Language Models
Large language models (LLMs) are the most popular current iteration of ML. They’re a type of AI known as generative AI that creates something new based on inputs and a corpus of training data. Generally, LLMs start out with unsupervised learning, and are then refined through supervised or “reinforcement” learning, according to Zittrain.
ChatGPT is probably the best-known LLM, though it’s far from the first. Zittrain noted that Microsoft released a short-lived chatbot named Tay to Twitter in 2016. Similar to ELIZA’s therapist conceit, Tay was intended to imitate “teenage web speak,” according to Zittrain. He said it’s not clear how Tay was trained, but it seems that it may have taken human responses to Twitter posts as examples of “correct” responses. Unfortunately, within 24 hours of being introduced, Tay began reflecting the worst of Twitter by responding to human users in highly inappropriate and offensive ways, leading Microsoft to take it down.
By 2018, advances in unsupervised learning led to auto-completion in texts and emails. And a new Microsoft chatbot was able to confidently answer questions like “Who is Adele?” with “She’s a singer” or “I don’t know who Adele is.”
ChatGPT is based on OpenAI’s GPT3.5. The model has, by many metrics, improved vastly over previous versions but, Zittrain remarked, it’s important to note how GPT2 was originally described. OpenAI billed it as “a large-scale unsupervised language model which generates coherent paragraphs of text” [emphasis added]. Zittrain pointed out that OpenAI didn’t describe GPT2 as generating “accurate” or “truthful” paragraphs of text; the company specifically used the word “coherent.”
Though the goal was only coherence, later iterations of GPT began to show elements of apparent cognition. GPT-3.5 could answer riddles coherently but often incorrectly. Now GPT-4 appears to give the right answer to some riddles that require logical reasoning. Zittrain and colleagues, experimenting with an early version of GPT-3, found that it could be prompted to write a speech for a government figure to deliver in the event the President, say, gets eaten by a snake, rendered perfectly in the style of presidential speechwriters. Normally a skeptic, Zittrain found these capabilities jaw-dropping. On the other hand, if you ask how many “n”s are in “mayonnaise,” it might tell you only one. Ultimately, “hallucinations” like this are innate to LLMs because they are designed for coherence, not truth or an accurate representation of knowledge, he explained. He added that LLM hallucinations may prevent generative AI tools from becoming the new search engines. At present, without (and sometimes even with) precise prompting, it seems these tools are just as likely to mislead you as they are to tell you the truth, he said.
AI Risks and Recommendations
Like the frog-not-frog supervised learning model, LLMs also suffer from the problem of interpretability, according to Zittrain. We don’t really understand how they generate their outputs, which makes it impossible to predict what they’ll say next. This has led some researchers to compare LLMs to the Shoggoth, an octopus-like being from the fiction of H.P. Lovecraft. AI researchers working on LLMs have attempted to address this through a type of supervised learning referred to as reinforcement learning from human feedback. Experts interact with the model and train it not to give inaccurate or otherwise undesirable answers. Critics say this is just putting a smiley face mask on the Shoggoth, and Zittrain pointed out that it’s worth considering the power in the hands of whoever designs that mask.
LLMs are also vulnerable to what are known as adversarial perturbations. ML researchers have discovered seemingly random strings of tokens that a user can append to a prompt to unlock forbidden responses. It’s unclear how or why these work, though the broader phenomenon predates LLMs and appears across the realm of ML.
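The token-string attacks on LLMs are hard to demonstrate without a live model, but the older, image-classifier version of the same phenomenon is easy to sketch. Below is the classic “fast gradient sign” perturbation against a toy, randomly initialized network: a nudge to the input, guided by the model’s own gradients, that shifts the output score while looking like faint noise to a human. This is our illustration, not an example Zittrain gave:

```python
import torch
from torch import nn

# Toy adversarial perturbation (fast gradient sign method), shown here on an
# image-style classifier rather than the LLM prompt attacks described above.
model = nn.Sequential(nn.Linear(32 * 32, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
image = torch.rand(1, 32 * 32, requires_grad=True)

# The gradient of the score with respect to the *input image* tells us which
# tiny pixel nudges push the model's answer around the fastest.
score = model(image)
score.backward()

epsilon = 0.1  # perturbation small enough to look like faint noise
adversarial = image + epsilon * image.grad.sign()

print(score.item(), model(adversarial).item())  # the score shifts; the image barely changes
```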
To mitigate the risks and get the most out of LLMs and other AI technologies, Zittrain offered the following recommendations:
- Treat it like a friend. If you want the best responses from an LLM, it pays to think carefully about your prompts and instruct it as explicitly as possible. Many of us are used to “search engine shorthand,” said Zittrain, but when interacting with LLMs, the more you treat the model like a friend, the better its answers will be. As models improve, their context windows grow, allowing them to remember more of the previous prompts and enabling you to iterate with them until you get the best answer. Asking LLMs to show their work or go step by step produces more accurate results. And in one fascinating example, telling a model that Turing Award winner Yann LeCun was skeptical it could solve a problem led to it finally getting the right answer. “The more you treat it like a friend, whether or not it’s merely a ‘stochastic parrot,’ in the memorable words of some prominent AI researchers, the more it apparently will deliver to you,” he said. (A brief prompting sketch follows this list.)
- Experiment early and often. Encourage employees to experiment with these technologies and to share their experiments, results, and findings, including with independent researchers. You need to know where, when, and how these experiments are happening. Do evaluations and perform red-teaming in a structured way so you can catch any problems early on and avoid a public Microsoft Tay moment.
- Keep a ledger. AI is like asbestos, said Zittrain. It can be found in all sorts of places, and you never know where it’s going to show up. Like asbestos, it also seems to be very good at its intended purpose, but problems may not manifest until later, when it’s difficult to change things. So it’s best to keep records of where you’re using AI, so that if you have to peel any of it back, you know where to look.
- Be cautious in your implementation. Once you’ve opened the AI Pandora’s box, keep the lid handy. When possible, implement AI tools in such a way that you can pull back if things aren’t working as expected.
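As promised above, here’s a hedged sketch of the “treat it like a friend” advice using the OpenAI Python SDK. The model name is a placeholder, the prompts are invented, and any chat-style API would work similarly:

```python
from openai import OpenAI  # assumes the OpenAI Python SDK is installed

client = OpenAI()          # reads OPENAI_API_KEY from the environment
MODEL = "gpt-4o-mini"      # placeholder model name; use whatever you have access to

# "Search engine shorthand": terse, no context, no instructions.
terse = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "phishing detection ideas"}],
)

# "Treat it like a friend": explicit context, a request to reason step by
# step, and room to iterate on the answer in later turns.
friendly = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a careful security analyst."},
        {"role": "user", "content": (
            "I run detection engineering for a mid-size retailer. "
            "Walk me through, step by step, three ways we could detect "
            "targeted phishing campaigns, and note the assumptions behind each."
        )},
    ],
)

print(terse.choices[0].message.content)
print(friendly.choices[0].message.content)
```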
Zittrain concluded with a quote from renowned science fiction author Arthur C. Clarke. In his 1962 book, “Profiles of the Future: An Inquiry into the Limits of the Possible,” Clarke put it succinctly: “Any sufficiently advanced technology is indistinguishable from magic.” Like a fine champagne, AI certainly feels like magic at times, but it’s important not to let it bewitch you.
This article was not written with AI.