As soon as “too scary” to launch, GPT-2 will get squeezed into an Excel spreadsheet

An illustration of robots sitting on a logical block diagram. — Getty Photos

It looks like AI giant language fashions (LLMs) are all over the place lately because of the rise of ChatGPT. Now, a software program developer named Ishan Anand has managed to cram a precursor to ChatGPT known as GPT-2—initially launched in 2019 after some trepidation from OpenAI—right into a working Microsoft Excel spreadsheet. It is freely accessible and is designed to teach individuals about how LLMs work.

“By utilizing a spreadsheet anybody (even non-developers) can discover and play instantly with how a ‘actual’ transformer works below the hood with minimal abstractions to get in the best way,” writes Anand on the official web site for the sheet, which he calls “Spreadsheets-are-all-you-need.” It is a nod to the 2017 analysis paper “Consideration is All You Want” that first described the Transformer structure that has been foundational to how LLMs work.

Anand packed GPT-2 into an XLSB Microsoft Excel binary file format, and it requires the newest model of Excel to run (however will not work on the internet model). It is fully native and does not do any API calls to cloud AI companies.

Although the spreadsheet accommodates a whole AI language mannequin, you’ll be able to’t chat with it like ChatGPT. As a substitute, customers enter phrases in different cells and see the predictive outcomes displayed in several cells nearly immediately. Recall that language fashions like GPT-2 had been designed to do next-token prediction, which implies they attempt to full an enter (known as a immediate, which is encoded into chunks known as tokens) with the almost certainly textual content. The prediction might be the continuation of a sentence or some other text-based process, akin to software program code. Totally different sheets in Anand’s Excel file enable customers to get a way of what’s going on below the hood whereas these predictions are happening.

Spreadsheets-are-all-you-need solely helps 10 tokens of enter. That is tiny in comparison with the 128,000-token context window of GPT-4 Turbo, however it’s sufficient to exhibit some fundamental ideas of how LLMs work, which Anand has detailed in a sequence of free tutorial movies he has uploaded to YouTube.

A video of Iman Anand demonstrating “Spreadsheets-are-all-you-need” in a YouTube tutorial.

In an interview with Ars Technica, Anand says he began the challenge so he may fulfill his personal curiosity and perceive the Transformer intimately. “Fashionable AI is so completely different from the AI I realized once I was getting my CS diploma that I felt I wanted to return to the basics to really have a psychological mannequin for the way it labored.”

He says he was initially going to re-create GPT-2 in JavaScript, however he loves spreadsheets—he calls himself “a spreadsheet addict.” He pulled inspiration from information scientist Jeremy Howard’s quick.ai and former OpenAI engineer Andrej Karpathy’s AI tutorials on YouTube.

“I walked away from Karpathy’s movies realizing GPT is generally only a huge computational graph (like a spreadsheet),” he says, “And [I] liked how Jeremy usually makes use of spreadsheets in his course to make the fabric extra approachable. After watching these two, it instantly clicked that it is perhaps potential to do the entire GPT-2 mannequin in a spreadsheet.”

We requested: Did he have any problem implementing a LLM in a spreadsheet? “The precise algorithm for GPT2 is generally loads of math operations which is ideal for a spreadsheet,” he says. “The truth is, the toughest piece is the place the phrases are transformed into numbers (a course of known as tokenization) as a result of it is textual content processing and the one half that is not math. It might have been simpler to do this half in a conventional programming language than in a spreadsheet.”

When Anand wanted help, he naturally received a bit of assist from GPT-2’s descendant: “Notably ChatGPT itself was very useful within the course of in phrases serving to me remedy thorny points I might come throughout or understanding varied levels of the algorithm, however it might additionally hallucinate so I needed to double-check it lots.”

GPT-2 rides once more

This entire feat is feasible as a result of OpenAI launched the neural community weights and supply code for GPT-2 in November 2019. It is notably attention-grabbing to see that exact mannequin baked into an academic spreadsheet as a result of when it was introduced in February 2019, OpenAI was afraid to launch it—the corporate noticed the potential that GPT-2 is perhaps “used to generate misleading, biased, or abusive language at scale.”

Nonetheless, the corporate launched the complete GPT-2 mannequin (together with weights information wanted to run it domestically) in November 2019, however the firm’s subsequent main mannequin, GPT-3, which launched in 2020, has not acquired an open-weights launch. A variation of GPT-3 later shaped the idea for the preliminary model of ChatGPT, launched in 2022.