Why GPT-3 Sat for Two Years Before the World Noticed

On the four ingredients that made 2022 the AI moment, the interface nobody talks about, and a way of thinking about technological change that you can use for the rest of your career.


A model nobody cared about

In June of 2020, OpenAI released GPT-3. It was, at the time, the largest language model ever built — 175 billion parameters, trained on hundreds of gigabytes of text filtered from a 45-terabyte web crawl, capable of writing essays, answering questions, generating code, and producing prose that was, to many readers, indistinguishable from human writing. The technical press covered it with a mix of awe and anxiety. Researchers called it a breakthrough. Sam Altman, OpenAI’s CEO, publicly warned people not to overhype it.

And then, for about two and a half years, almost nobody outside of the AI research community used it.

GPT-3 was available through an API — a programmer’s interface that required you to write code to interact with the model. If you were a developer, you could build applications on top of it. If you were a researcher, you could run experiments with it. If you were a normal person who wanted to ask it a question, you couldn’t. There was no place to type. There was no chat window. There was no “talk to GPT-3” button anywhere on the internet. The most powerful language model in the world was sitting behind a developer console, waiting for someone to build a front door.

On November 30, 2022, OpenAI built the front door. They called it ChatGPT. Within five days, it had a million users. Within two months, it had a hundred million — making it the fastest-growing consumer application in the history of the internet. The technology that had been sitting quietly for two and a half years became, overnight, the most talked-about product on Earth.

Here is the question I want to spend this post answering, because the answer teaches you something that goes far beyond AI: why did that particular tool, in that particular moment, work?

The short answer is that November 30, 2022 wasn’t a single breakthrough. It was a confluence — four ingredients arriving at the same table, finally in the right amounts, at the right time. And none of them, alone, would have been enough.

Ingredient One: The architecture — Attention Is All You Need

The foundational ingredient was a technical breakthrough that happened five years before ChatGPT launched, in a paper that almost nobody outside of machine learning has read.

In 2017, a team of eight researchers at Google — Ashish Vaswani and seven co-authors — published a paper titled “Attention Is All You Need” in the proceedings of the NeurIPS conference. The paper introduced a new neural network architecture called the transformer, and it changed everything.

Before the transformer, the dominant architectures for processing language were recurrent neural networks and their variants (LSTMs, GRUs), which processed text sequentially — one word at a time, left to right, maintaining a running memory of what had come before. This worked, but it was slow, and the running memory degraded over long sequences. The transformer replaced this sequential processing with something called self-attention, a mechanism that allows the model to look at every word in a sequence simultaneously and learn which words are most relevant to which other words, regardless of how far apart they are. The result was dramatically faster training (because you could parallelize the computation across all words at once instead of processing them one at a time) and dramatically better performance on long-range dependencies (because the model could directly attend to a word five hundred tokens back instead of hoping the running memory hadn’t degraded by then).
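
If you want to see how small the core mechanism really is, here is a minimal sketch of single-head scaled dot-product self-attention in plain NumPy. It is a toy, not a reproduction of any production transformer (the dimensions and weights are invented for illustration), but the three steps it performs — project, score every token against every other token, mix — are the ones the paper describes.

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # X: (seq_len, d_model) token embeddings.
        # Wq, Wk, Wv: (d_model, d_k) learned projection matrices.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        d_k = Q.shape[-1]
        # Score every token against every other token in one matrix multiply --
        # this is the step that replaces the RNN's sequential running memory.
        scores = Q @ K.T / np.sqrt(d_k)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
        # Each output row is a relevance-weighted mix of every position, however
        # far apart -- direct access instead of a degrading memory.
        return weights @ V

    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))                               # 4 tokens, 8-dim embeddings
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)                # (4, 8)

Notice that nothing in that computation happens one word at a time: the entire scores matrix is a single matrix multiplication, which is exactly what makes transformers parallelize so well on modern hardware.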

The transformer is the T in GPT. Without it, none of the models that followed — GPT-2, GPT-3, GPT-4, BERT, PaLM, Claude, Gemini — would have been possible. It is the architectural foundation on which the entire current generation of AI is built. And it was published five years before ChatGPT, in a paper that most people who use ChatGPT every day have never heard of.

Ingredient Two: Scale — the discovery that more is different

The architecture was necessary but not sufficient. The transformer made it possible to build large language models. The second ingredient was the discovery that making them very large produced qualitatively different behavior — not just incremental improvement, but the emergence of capabilities that smaller models simply did not have.

GPT-2, released in 2019, had 1.5 billion parameters. GPT-3, released in 2020, had 175 billion — more than a hundred times larger. The jump was not just quantitative. GPT-3 could do things GPT-2 could not do at all: few-shot learning (performing a new task after being shown just a few examples), zero-shot reasoning (attempting a task it had never been trained on), code generation, and coherent long-form writing. These capabilities were not programmed in. They emerged from scale — from the combination of a larger model, more training data, and more computation.
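
“Few-shot learning” sounds abstract, so here is roughly what it looked like in practice, borrowing the English-to-French pattern the GPT-3 paper itself used. The prompt below is illustrative, not a transcript:

    # Few-shot prompting: the task is specified entirely inside the prompt,
    # with a handful of examples. No retraining, no fine-tuning -- the model
    # simply continues the pattern.
    prompt = (
        "English: cheese -> French: fromage\n"
        "English: bread  -> French: pain\n"
        "English: apple  -> French:"
    )
    # Given this text to complete, GPT-3 typically continues with " pomme" --
    # performing a task it was never explicitly trained to do.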

This phenomenon — capabilities appearing suddenly as models get bigger, rather than improving gradually — is one of the most debated and fascinating findings in modern AI research. The researchers who built these systems did not predict many of the emergent capabilities in advance. They built bigger models, ran them, and discovered that the bigger models could do things nobody had told them to do. Scale, it turned out, was not just “more of the same.” Scale was a phase transition, the way heating water doesn’t just make it hotter water — at some point, it becomes steam. GPT-3 was steam. GPT-2 was hot water. Same substance, different state.

Ingredient Three: The polish — RLHF and the InstructGPT breakthrough

Here is where most popular accounts of the ChatGPT story stop. Architecture plus scale equals ChatGPT. Transformer plus big data equals AI revolution. That framing is not wrong, but it is incomplete, because it leaves out the ingredient that made GPT-3 go from “impressive but unreliable research tool” to “thing your grandmother can use.”

In March 2022 — eight months before ChatGPT launched — OpenAI published a paper describing a model called InstructGPT. The paper applied a training technique called Reinforcement Learning from Human Feedback, or RLHF, to the problem of making a language model follow instructions. The idea was simple in concept: after training the model on text data (the standard approach), you add a second phase where human evaluators rate the model’s outputs for quality, helpfulness, and safety. Those ratings are used to train a separate “reward model,” which is then used to fine-tune the original model’s behavior through reinforcement learning. The model learns, in effect, what humans consider a good response versus a bad one, and it adjusts its behavior accordingly.
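
The reward-model phase boils down to a pairwise comparison loss: a human picks the better of two responses, and the reward model is trained to score the preferred one higher. A minimal sketch of that loss, following the InstructGPT setup (the scalar scores below are stand-ins for a real reward model’s outputs):

    import numpy as np

    def pairwise_reward_loss(r_chosen, r_rejected):
        # -log sigmoid(r_chosen - r_rejected): small when the reward model
        # already scores the human-preferred response higher, large when it
        # prefers the rejected one.
        return -np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected))))

    # Reward model disagrees with the human rater: large loss, strong correction.
    print(pairwise_reward_loss(r_chosen=0.2, r_rejected=1.1))   # ~1.24
    # Reward model already agrees: small loss, little to learn.
    print(pairwise_reward_loss(r_chosen=2.0, r_rejected=-1.0))  # ~0.05

The trained reward model then becomes the objective for the reinforcement-learning pass (PPO, in the paper), which is where the language model itself adjusts its behavior.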

The difference was measurable and dramatic. InstructGPT, despite being much smaller than GPT-3 (1.3 billion parameters versus 175 billion), was preferred by human evaluators over GPT-3 in head-to-head comparisons. A model one-hundredth the size was producing outputs that humans rated as better, because the RLHF training had aligned the model’s behavior with what humans actually wanted — clear answers, helpful explanations, honest caveats — rather than what was statistically likely in the training data.

RLHF is the polish layer. Without it, you have a brilliant but erratic model that sometimes produces genius and sometimes produces nonsense with equal confidence. With it, you have a model that behaves like a reasonably helpful, reasonably cautious conversational partner. The raw capability came from the architecture and the scale. The usability came from the polish. And usability, as we are about to see, turned out to be the ingredient that mattered most.

Ingredient Four: The interface — and this is the one nobody talks about

I want to make a claim that I think is underappreciated in most analyses of the 2022 AI moment, and I want to make it as clearly as I can, because it has implications far beyond this one product launch.

The interface was the breakthrough.

Not the model. Not the scale. Not the RLHF polish. The interface. The chat window. The simple, empty text box with a cursor blinking in it, waiting for you to type a question in plain English and get an answer back in plain English. That was the thing that changed everything.

GPT-3 had existed for two and a half years. It had 175 billion parameters. It could write essays and generate code and hold conversations. And almost nobody used it, because the only way to interact with it was through an API — a programmer’s tool, designed for programmers, accessible only to people who could write code. The most powerful language model in the world was locked behind a developer console, and the lock was not technical. It was experiential. There was no way for a normal human being to sit down and talk to the thing.
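
For concreteness, here is roughly what “using GPT-3” meant in 2020, sketched with the original 0.x openai Python client (long since superseded; reconstructed from memory rather than current documentation). You needed a waitlist-approved API key and enough code literacy to write something like this:

    import openai  # the 2020-era client (openai<1.0); the modern SDK differs

    openai.api_key = "sk-..."  # at the time, obtainable only through a waitlist

    response = openai.Completion.create(
        engine="davinci",                     # the original GPT-3 model on the API
        prompt="Q: Why is the sky blue?\nA:",
        max_tokens=64,
    )
    print(response["choices"][0]["text"])

Set that next to an empty text box with a blinking cursor, and the experiential lock becomes obvious.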

ChatGPT changed one thing: it gave the model a face. A chat window. A conversational interface that reframed the interaction from “submit a prompt to a language model and parse the JSON response” to “ask a question and get an answer.” The underlying technology was GPT-3.5 — an improvement over GPT-3, but not a revolutionary leap. The revolutionary leap was the interface, which took a research tool and turned it into a social experience.

One observer put it perfectly: ChatGPT shifted the user’s relationship to the model from “a piece of writing for the model to finish” to “a question calling for an answer.” That is not a technical change. That is a design change. It is a change in how the human being sitting at the keyboard understands what they are doing. And that change — the reframing — is what produced a hundred million users in two months.

This is the ingredient most analyses leave out. They give you three: transformer architecture, scale, RLHF. And those three are necessary. But without the fourth — the interface that made the technology legible to a non-technical human being — GPT-3 would still be sitting behind its API, used by developers, unknown to the public, waiting for someone to build the front door.

The front door was the breakthrough.

The wave that proves the confluence

If 2022 had only produced ChatGPT, you could argue that the moment was about one product, one company, one good decision about interface design. But 2022 didn’t produce just ChatGPT. In an eight-month window, it produced an entire wave of generative AI tools that hit the public simultaneously:

DALL-E 2 was announced in April 2022 and opened to everyone on September 28. Midjourney entered open beta on July 12, 2022. Stable Diffusion was publicly released in August 2022. ChatGPT launched on November 30, 2022.

Four major generative AI products, from different companies, using different architectures, applied to different media (images and text), all arriving in the same eight-month window. That is not a coincidence. That is a systemic convergence — a moment when multiple ingredients that had been developing independently for years reached a threshold simultaneously.

The transformer architecture (2017) had matured enough to support both language and image generation at scale. The scale of available training data and computation had crossed a critical threshold. The alignment and fine-tuning techniques (RLHF for language, classifier-free guidance for images) had gotten good enough to produce usable outputs. And the interfaces — Discord bots for Midjourney, web apps for DALL-E and Stable Diffusion, a chat window for ChatGPT — had finally made the technology accessible to non-technical users.

No single ingredient caused the wave. The wave was the confluence of all of them.

Confluence thinking: a lens you can use for the rest of your career

Here is the part of this post I want you to carry forward long after the specific details of transformers and RLHF have been superseded by whatever comes next.

The 2022 AI moment was not a single breakthrough. It was a confluence — a moment when multiple independent ingredients, each developing on its own timeline, converged in a way that produced something none of them could have produced alone. The transformer was necessary. Scale was necessary. RLHF was necessary. The interface was necessary. Remove any one of them and the moment doesn’t happen. The model without the interface sits unused. The interface without the model has nothing to offer. The model without RLHF is too erratic to trust. The RLHF without the scale has nothing to polish.

This pattern — big shifts come from multiple ingredients aligning, not single breakthroughs — is not unique to AI. It is how almost every major technological change in history actually happened, if you look closely enough.

The printing press was a confluence: movable type (which had existed in China centuries earlier), oil-based ink (which Gutenberg adapted from painting), the wine press (which he repurposed as the mechanical frame), and affordable paper (which had recently become available in Europe). Remove any one of those ingredients and Gutenberg’s press doesn’t work, or doesn’t scale, or doesn’t transform European civilization the way it did.

The iPhone was a confluence: capacitive multi-touch screens, miniaturized processors, mobile broadband networks, and a software ecosystem (the App Store) that let third-party developers build on top of it. Remove any one of those and you get a different, lesser product — a PDA, a fancy phone, a media player, but not the thing that remade the world.

The pattern repeats across every domain of human invention. The breakthrough that looks, in retrospect, like a single moment is almost always, on closer inspection, a confluence of ingredients that were developing independently and converged at a specific point in time.

Confluence thinking is the habit of looking at a new technology — or any major shift — and asking: what are the ingredients? How many are there? Which ones are mature and which ones are still developing? And what happens when the last one crosses the threshold? This is a thinking tool, not a technical skill. It works whether you’re analyzing AI, evaluating a startup, reading a history book, or planning your own career. The person who can identify the ingredients of a coming confluence before the confluence happens is the person who is standing in the right place when the wave arrives.

What I want you to take with you

The next time somebody tells you that a single product or a single company or a single genius “invented” something that changed the world, be polite, and then look for the ingredients. The world almost never changes because of one thing. It changes because of four things, or five, or six, arriving at the same table at the same time, each one necessary and none of them sufficient.

GPT-3 sat for two and a half years before the world noticed. The model was ready. The scale was there. The polish was coming. But the front door hadn’t been built yet. When somebody finally built the front door — a simple chat window, a blinking cursor, a place to type a question in plain English — a hundred million people walked through it in two months.

The model was the engine. The interface was the door. The door was the breakthrough.

Remember that. Not just for AI. For everything you will ever build. The most powerful engine in the world is useless if nobody can find the door.

Build the door.


Sources and further reading

On the transformer architecture: Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017), “Attention Is All You Need,” Advances in Neural Information Processing Systems (NeurIPS 2017). This paper introduced the transformer architecture and has been cited tens of thousands of times. It is the foundational technical document for the entire current generation of large language models.

On GPT-3 and the discovery of emergent capabilities at scale: Brown, T. B., et al. (2020), “Language Models are Few-Shot Learners,” Advances in Neural Information Processing Systems (NeurIPS 2020). This is the GPT-3 paper, documenting the 175-billion-parameter model and its few-shot and zero-shot learning capabilities. GPT-3 was released via API in June 2020 and licensed exclusively to Microsoft in September 2020.

On InstructGPT and RLHF: Ouyang, L., et al. (2022), “Training language models to follow instructions with human feedback,” arXiv preprint arXiv:2203.02155 (later published in NeurIPS 2022). This paper describes the InstructGPT model and the RLHF technique that became the polish layer for ChatGPT. The finding that a 1.3B-parameter InstructGPT model was preferred by human evaluators over the 175B-parameter GPT-3 is one of the most striking results in the alignment literature.

On ChatGPT’s launch and adoption: ChatGPT was released on November 30, 2022, built on GPT-3.5 (a fine-tuned variant of GPT-3). It reached 1 million users in five days and 100 million users in approximately two months, making it the fastest-growing consumer application in internet history at the time.

On the 2022 generative AI wave timeline: DALL-E 2 was announced April 6, 2022, entered beta in July, and opened to the public September 28, 2022. Midjourney entered open beta July 12, 2022. Stable Diffusion was publicly released August 2022. ChatGPT launched November 30, 2022. All four products arrived within an eight-month window, from different companies, using different architectures, applied to different media.

On the interface-as-breakthrough framing: The observation that ChatGPT shifted the user’s relationship from “a piece of writing for the model to finish” to “a question calling for an answer” draws on analysis from multiple sources covering the launch, including coverage in The Verge, The New York Times, and the essay “What Was ChatGPT?” (2025) at cyberneticforests.com.

On confluence as a pattern in technological change: The concept of technological convergence is discussed across the innovation literature. For the printing press example: Eisenstein, E. L. (1979), The Printing Press as an Agent of Change, Cambridge University Press. For the general principle that major innovations are typically confluences of multiple independent advances: Arthur, W. B. (2009), The Nature of Technology: What It Is and How It Evolves, Free Press.

Note to readers: the four-ingredient framework presented here — transformer architecture, scale, RLHF, and the chat interface — synthesizes concepts that exist independently in the technical literature. The contribution of this post is the specific framing, weighting, and connection of those ingredients, particularly the elevation of the interface as a co-equal ingredient alongside the technical components. The concept of “confluence thinking” as a transferable analytical lens is, to my knowledge, original to this series. Verify the primary sources yourself before quoting.

The AI That Saved $25 Million a Year and Couldn’t Save the Company That Built It

The story of XCON, the first commercially successful expert system — and what its triumph and its company’s collapse can teach every builder about the difference between solving a problem and leading an organization.


A company drowning in its own success

In 1978, Digital Equipment Corporation had a problem that was, in a strange way, the best kind of problem to have. They were selling too many computers and couldn’t keep up.

DEC — the second-largest computer company in the world, behind only IBM — built the VAX, a family of powerful minicomputers that businesses could customize to their specific needs. The selling point was the customization: each VAX system was configured from thousands of individual components — processors, memory modules, disk drives, controllers, cables, cabinets, power supplies — assembled into a unique combination tailored to what the customer ordered.

The problem was that configuring these systems required deep technical expertise, and even the experts got it wrong. A lot. If a customer ordered a disk drive, someone had to make sure the order also included the right disk controller, the right cables, the right power supply for the additional load, and the right cabinet space to house it all. A single VAX system could involve thousands of separate components, and the relationships between them were complex, interdependent, and poorly documented. Human configurators were getting orders wrong somewhere between 30 and 40 percent of the time. Wrong components shipped. Incompatible parts arrived at the customer site. Systems that should have worked didn’t. The manual configuration process was taking ten to fifteen weeks per order. DEC was hemorrhaging money on returns, rework, and angry customers — and the more systems they sold, the worse the problem got.

Into this mess walked a researcher from Carnegie Mellon University named John McDermott.

The man with 2,500 rules

McDermott was a specialist in artificial intelligence, specifically in a branch of AI called expert systems — software designed to capture the decision-making knowledge of human experts and apply it systematically. The idea was simple in theory: interview the people who know how to configure a VAX correctly, translate their knowledge into a set of if-then rules, and let the computer apply those rules to every incoming order, consistently, without fatigue, without mistakes, and without taking fifteen weeks to do it.

The practice was anything but simple. McDermott spent months interviewing DEC’s technical configurators, extracting their knowledge rule by rule. What he found was illuminating and a little alarming: the experts didn’t always agree with each other. Different configurators had different approaches to the same problem, different rules of thumb, different preferences. Part of building the system was not just capturing expertise but reconciling it — finding the underlying logic beneath the disagreements and encoding a single, consistent decision process that incorporated the best of what everyone knew.

The system McDermott built was called R1, later renamed XCON — short for eXpert CONfigurer. It was written in a rule-based programming language called OPS5, and it went into production use at DEC’s plant in Salem, New Hampshire in 1980. Within a few years it contained roughly 2,500 if-then rules, each encoding a small piece of configuration logic: if the customer ordered this drive, then include this controller. If this controller is present and this cabinet is full, then add a second cabinet. If the power draw exceeds this threshold, then upgrade the power supply.
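
To make the flavor of those rules concrete, here is a toy forward-chaining configurer in Python. It is emphatically not OPS5 and not DEC’s rule base; the part names are DEC-flavored inventions used purely for illustration. But the control structure — fire rules against the order until nothing more can be added — is the basic shape of a production system like XCON.

    # A toy forward-chaining configurer. Each rule pairs a condition over the
    # parts selected so far with the parts to add when it fires. Part names
    # are invented for this sketch.
    order = {"RA81-disk-drive"}

    RULES = [
        (lambda parts: "RA81-disk-drive" in parts and "UDA50-controller" not in parts,
         {"UDA50-controller"}),
        (lambda parts: "UDA50-controller" in parts and "BC06-cable" not in parts,
         {"BC06-cable"}),
        (lambda parts: len(parts) >= 3 and "H9642-cabinet" not in parts,
         {"H9642-cabinet"}),
    ]

    changed = True
    while changed:          # keep firing rules until no rule has anything to add
        changed = False
        for condition, additions in RULES:
            if condition(order):
                order |= additions
                changed = True

    print(sorted(order))
    # ['BC06-cable', 'H9642-cabinet', 'RA81-disk-drive', 'UDA50-controller']

OPS5 matched rules far more efficiently than this naive loop (via the Rete algorithm), but the fire-until-quiescence cycle is the same.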

Twenty-five hundred rules, each one a tiny piece of a human expert’s hard-won knowledge, running in sequence against every incoming order, catching the errors that human configurators missed thirty to forty percent of the time.

The results were not subtle.

$25 million a year and 95% accuracy

By 1986, XCON had processed over 80,000 orders. Its accuracy rate was between 95 and 98 percent — a dramatic improvement over the 60 to 70 percent accuracy of the manual process. Configuration time dropped from weeks to minutes. The system was saving DEC an estimated $25 million per year in avoided errors, reduced rework, and faster order fulfillment.

XCON became the most famous success story in the history of expert systems. It proved, in a way that no academic demonstration had, that AI could work in a real industrial setting, at scale, on a problem that mattered. It wasn’t a research prototype. It wasn’t a toy. It was a production system processing every VAX order that came through the door, day after day, and it was doing the job better than the humans it was designed to assist.

The AI community celebrated. Papers were published. Conferences were held. Other companies rushed to build their own expert systems, hoping to replicate DEC’s success. The mid-1980s saw what historians of AI now call the “expert systems boom” — a wave of investment and enthusiasm driven in large part by XCON’s demonstrated commercial value. For a few years, it looked like expert systems were going to be the future of artificial intelligence, and XCON was the proof.

Meanwhile, the company that built XCON was starting to die.

The company that couldn’t see what was coming

Here is the part of the story that almost nobody tells when they talk about XCON, and it is the part that matters most for anyone who wants to build things that last.

Digital Equipment Corporation in 1988 was a colossus. Eleven and a half billion dollars in annual revenue. A hundred and twenty-five thousand employees. Operations in more than eighty countries. The number two computer company on Earth, behind only IBM. Fortune magazine called it one of the most admired companies in America.

And its founder, Ken Olsen — a brilliant engineer, a genuine visionary in the minicomputer era — was about to make a series of decisions that would destroy it.

The first and most famous was his stance on personal computers. In 1977, Olsen said publicly that there was “no reason for any individual to have a computer in his home.” When DEC’s own engineers demonstrated two prototype microcomputers in 1974 — before the Altair, before the Apple I — Olsen chose not to proceed. When another personal computer proposal came in 1977, he rejected that too. The market that would eventually eat the minicomputer alive was growing right in front of him, and he looked at it and saw nothing worth pursuing.

The second decision was to go upmarket instead of down. Rather than competing in the emerging PC space, DEC decided in the mid-1980s to challenge IBM in the high-end mainframe market. The result was the VAX 9000, a massive engineering effort that consumed an estimated $3 billion in development capital and landed in a market that didn’t want it. The VAX 9000 was a commercial failure. Three billion dollars, gone.

The third was a pattern of organizational dysfunction that Edgar Schein, the MIT organizational psychologist who studied DEC extensively, documented in his book DEC Is Dead, Long Live DEC. The company’s engineering-driven culture, which had been its greatest strength in the minicomputer era, became its greatest liability as the market shifted. Engineers were empowered to pursue their own projects with minimal coordination. Product lines multiplied and competed with each other internally. The company couldn’t execute a coherent strategy because it couldn’t agree on what the strategy should be.

By the early 1990s, DEC’s minicomputer sales were collapsing. The first layoffs came. Then more layoffs. Then more. The company that had employed 125,000 people began hemorrhaging talent and revenue simultaneously.

And here is the detail that should make every technologist in the world sit up and pay attention: during this entire period of decline, XCON was still working. The AI was still processing orders. The AI was still saving millions of dollars a year. The AI was still doing its job at 95 to 98 percent accuracy, day after day, exactly as it had been designed to do.

XCON could not save DEC, because XCON’s job was to configure VAX systems correctly, and the problem that was killing DEC had nothing to do with VAX configuration. The problem was leadership. The problem was strategy. The problem was a founder who couldn’t see that the market had moved, a management structure that couldn’t coordinate a response, and a $3 billion bet on the wrong product at the wrong time.

The AI solved the problem it was pointed at. The company died of the problems nobody pointed anything at.

The AltaVista footnote

I want to add one more detail, because it is so painful it borders on poetry.

In 1995, in the middle of its decline, DEC launched an internet search engine called AltaVista. It was, for a brief period, the most popular search engine in the world — dominant, widely loved, technically excellent. If you used the internet in the late 1990s, you probably used AltaVista.

According to multiple sources, in 1997, two Stanford PhD students named Larry Page and Sergey Brin approached DEC with their PageRank system, hoping to be acquired. They wanted AltaVista to adopt their technology — or, failing that, to be brought into the company. DEC, deep in its death spiral, passed.

Page and Brin went on to found Google.

DEC was sold to Compaq in 1998 for $9.6 billion — a fraction of what the company had been worth a decade earlier. AltaVista was eventually shut down. Google became one of the most valuable companies in the history of civilization.

The company that built the first commercially successful AI system also had the first dominant search engine and was offered the technology that would become Google. It lost all three — not because the technology failed, but because the leadership couldn’t see what they had.

What this story is actually about

I did not tell you this story so you could feel sorry for DEC. DEC’s collapse is decades in the past and the company is gone. I told you this story because it contains, compressed into a single corporate biography, the most important lesson a builder can learn — and it is a lesson that is not about technology at all.

The lesson is this: a brilliant technical solution, perfectly executed, applied to the wrong level of the problem, will not save you.

XCON was a perfect solution to a real problem. Configuration errors were costing DEC millions. XCON fixed that. XCON did exactly what it was built to do, and it did it beautifully, for years, without fail. If the only thing wrong with DEC had been configuration errors, XCON would have saved the company.

But the things wrong with DEC were not technical. They were strategic. They were organizational. They were about vision, leadership, market awareness, and the willingness to cannibalize your own success before your competitors do it for you. No expert system in the world can solve those problems, because those problems live in the minds and decisions of the people running the organization, not in the systems the organization uses.

This is the mistake technologists make over and over again, in every era, with every new technology. They build something brilliant. It works. It solves a real problem. And then they assume that solving a real problem is the same thing as building a successful organization. It isn’t. Solving a real problem is necessary. It is not sufficient. The gap between “our technology works” and “our company thrives” is filled with strategy, leadership, market timing, organizational design, and a hundred other things that have nothing to do with how elegant your code is or how many rules are in your expert system.

What I want you to take with you

If you are reading this as a student — someone learning to build things, someone at the beginning of a career in technology or game design or immersive environments or any of the other fields this academy touches — I want you to hold two things in your mind at once.

The first is respect for the craft. John McDermott built something remarkable. Twenty-five hundred rules, extracted painstakingly from human experts who didn’t always agree, assembled into a system that processed eighty thousand orders at near-perfect accuracy and saved a company $25 million a year. That is beautiful work. That is the kind of work you should aspire to do — deep, careful, expert, and genuinely useful. The craft matters. The technology matters. Do not let the rest of this story make you cynical about the value of building things well.

The second is a clear-eyed understanding that the craft is not the whole game. DEC had the best expert system in the world. DEC had the best search engine in the world. DEC had access to the technology that would become Google. And DEC is gone, because the people running the company could not see the market changing around them, could not coordinate a strategic response, and could not make the painful decisions that survival required.

The builder who only knows how to build is McDermott. McDermott did his job perfectly. The company died anyway, and there was nothing McDermott could have done about it from where he sat.

The builder who knows how to build and how to lead — who understands technology and strategy, who can write the code and read the market, who can solve the technical problem and see the organizational one — is the builder who survives. That is the builder this academy is training you to be. Not just a technician. Not just an engineer. A producer. A leader. Someone who can look at the whole board, not just the piece in front of them.

XCON was a masterpiece. The company it lived inside was a cautionary tale. Hold both of those truths at the same time, and you will be better prepared than most of the people you will ever compete with.

Build brilliantly. Lead wisely. And when somebody offers you the next Google, for the love of everything, don’t pass.


Sources and further reading

On XCON (R1) — development, architecture, and impact: McDermott, J. (1982), “R1: A Rule-Based Configurer of Computer Systems,” Artificial Intelligence, 19(1), 39-88. Also: Bachant, J., and McDermott, J. (1984), “R1 Revisited: Four Years in the Trenches,” AI Magazine, 5(3). The ACM case study: “Expert Systems for Configuration at Digital: XCON and Beyond,” Communications of the ACM, 1989. The system processed 80,000+ orders by 1986 with 95-98% accuracy and saved DEC an estimated $25 million annually.

On Digital Equipment Corporation — rise and fall: Schein, E. H. (2003), DEC Is Dead, Long Live DEC: The Lasting Legacy of Digital Equipment Corporation, Berrett-Koehler Publishers — the definitive organizational analysis by the MIT psychologist who studied DEC for decades. For the corporate timeline: “Digital Equipment Corporation,” Wikipedia, which compiles the $11.5B revenue figure (1988), 125,000 employees, the Ken Olsen “no reason for any individual to have a computer in his home” quote (1977), the rejected PC prototypes (1974, 1977), and the Compaq acquisition ($9.6B, 1998).

On the VAX 9000 failure: Multiple sources estimate the development cost at approximately $3 billion. The system was a commercial failure in the high-end mainframe market that DEC was attempting to enter in competition with IBM.

On AltaVista and the Google connection: AltaVista launched in 1995 as DEC’s internet search engine and was briefly the most popular search engine in the world. The account of Larry Page and Sergey Brin approaching DEC in 1997 appears in multiple sources covering DEC’s decline, though the details vary. What is documented is that DEC passed on the opportunity, Page and Brin founded Google, and AltaVista was eventually discontinued.

On the expert systems boom of the 1980s: The commercial success of XCON was a major catalyst for the expert systems investment wave of the mid-1980s. For context on the broader AI timeline: Russell, S. J., and Norvig, P. (2020), Artificial Intelligence: A Modern Approach (4th ed.), Pearson — Chapter 1 provides a historical overview including the expert systems era and the subsequent AI winter.

Note to readers: verify the primary sources yourself before quoting. The DEC story in particular has been told many ways by many people, and the details — especially around the AltaVista/Google connection — vary across accounts. The citations above are entry points into a much larger and contested corporate history.

Five Thousand Years to Get Here

A brief history of every tool humanity ever built to teach its children — and the one that finally broke the pattern.

By D.W. Denney


Every tool on the same curve

I want to tell you a story that covers five thousand years and fits on the back of a napkin. It’s the story of every educational technology humanity has ever invented, and the punchline is that until very recently, they were all doing the same thing.

Here’s the napkin version. Somebody knows something. They need to get it into somebody else’s head. Every tool we’ve ever built for that purpose — every single one, across all of recorded history — has been a more efficient way to do one of four things: store information, distribute information, drill information into memory, or assess whether the information stuck. That’s it. Four functions. Five millennia. One curve.

Let me walk you through the timeline, and watch how the technology changes while the function doesn’t.

Oral tradition. Before writing, knowledge lived in the mouths of elders and was transferred by speech. The teacher spoke. The student listened, repeated, and memorized. If the elder died before the transfer was complete, the knowledge died with them. The storage medium was the human brain. The distribution method was the human voice. The range was the distance sound carries across a campfire. This worked, and it worked for a long time, and the stories and songs and genealogies that survived this era are a testament to how powerful the human memory can be when it has no other option. But the system was fragile. One forgotten line, one dead elder, one scattered tribe, and the knowledge was gone.

Writing. Sometime around 3200 BCE, the Sumerians started pressing wedge-shaped marks into wet clay tablets. The Egyptians wrote on papyrus. The Greeks and Romans wrote on parchment. The function was the same as oral tradition — store information and transmit it — but the storage medium had changed. Knowledge was no longer dependent on a living memory. It could survive the death of the person who knew it. This was an enormous leap, and it changed everything about how civilizations accumulated knowledge across generations. But the teaching model didn’t change. A teacher still stood in front of students and talked. The students still listened and memorized. The writing was a backup, not a replacement.

The printing press. In the 1440s, Johannes Gutenberg built a machine that could produce identical copies of a written text at a speed and cost that handwriting could not match. The function was the same as writing — store and distribute information — but the distribution had scaled. A book that previously existed in three handwritten copies could now exist in three hundred, then three thousand. The implications for education were staggering: for the first time, a student could own the same text as the teacher. The textbook was born. But the teaching model still didn’t change. The teacher still lectured. The students still listened. The textbook was a reference, not a tutor.

The chalkboard. In the early 1800s, a large slate surface mounted on a classroom wall gave teachers the ability to write and draw in real time, visible to an entire room of students. The function was the same as a lecture — distribute information — but the channel had expanded from purely auditory to auditory-visual. The teacher could now show as well as tell. This was a genuine improvement in the richness of the instructional experience. But the model didn’t change. One teacher, many students, information flowing in one direction.

Pencil, paper, and the workbook. The mass production of cheap paper and reliable pencils in the 1800s gave every student their own surface to work on. The function was practice and assessment — drill the information, test whether it stuck. Flashcards, worksheets, and workbooks followed. Spaced repetition was discovered and formalized. All of these were improvements in the efficiency of a very old function: getting information from short-term memory into long-term memory through structured repetition. The model didn’t change. The student still practiced alone, and the teacher still graded the result after the fact.

Radio, film, and television. Starting in the 1920s, electronic broadcast media made it possible to deliver a lecture to thousands or millions of students simultaneously. Educational radio, instructional films, and later educational television (think Sesame Street, the single most studied educational intervention in the history of broadcast media) all did the same thing: distribute a lecture at scale. A great teacher could now reach students who would never have had access to that teacher in person. This was a real and important advance. But the model didn’t change. The lecture was still one-directional. The student still sat and received. The broadcast didn’t know whether the student understood, or was confused, or had fallen asleep.

The personal computer and educational software. Starting in the 1980s, computers in classrooms and homes delivered interactive drills, educational games, and multimedia presentations. The function was practice and assessment — the same function as the workbook — but the medium was now digital, which meant the drill could be adaptive (harder questions if you got the last one right, easier ones if you didn’t) and the feedback could be immediate (a green checkmark or a red X, right now, instead of a graded paper returned next Tuesday). This was a genuine improvement. But the model didn’t change. The software presented material. The student responded. The software evaluated the response. It was a faster, flashier workbook.

The internet and the LMS. Starting in the 1990s, the internet made it possible to distribute lectures, textbooks, workbooks, and assessments to anyone with a connection, anywhere in the world. Canvas. Blackboard. Moodle. Khan Academy. Coursera. All of these are, at their core, digital infrastructure for doing the same four things humanity has been doing since the Sumerians: store information, distribute information, drill it into memory, and test whether it stuck. Khan Academy’s innovation was putting a world-class lecture series on YouTube for free. Coursera’s innovation was putting university courses online with automated grading. Both were genuine advances in access and distribution. Neither changed the fundamental model. The student still watched, practiced, and was assessed. The system still didn’t know who they were or what they were struggling with or why.

The pattern

Do you see it? Every technology on that timeline improved the efficiency of one of the four functions. Writing improved storage. The printing press improved distribution. The chalkboard improved the richness of the lecture. The workbook improved the efficiency of drill. The computer improved the speed of feedback. The internet improved the reach of all of the above. Each one was a genuine advance, and I don’t want to diminish any of them — the printing press alone arguably created the modern world.

But none of them changed the model. The model, from the campfire to the LMS, has always been the same: one teacher, many students, information flowing in one direction, with periodic checks to see if the students absorbed it. The ratio has changed. The speed has changed. The medium has changed. The model has not. For five thousand years, humanity has been building faster, cheaper, more widely distributed versions of the same four-function educational machine.

And there’s a reason for that, and the reason has a name.

The Two Sigma Problem

In 1984, an educational psychologist at the University of Chicago named Benjamin Bloom published a paper in Educational Researcher that remains one of the most cited — and most haunting — papers in the history of education. The paper was called “The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring.”

Bloom and his graduate students had conducted a straightforward experiment. They divided students into three groups. The first group received conventional classroom instruction — one teacher, thirty students. The second group received the same instruction but with a structured feedback-and-correction system called mastery learning. The third group received one-on-one tutoring with mastery learning techniques.

The results were not subtle. The average student in the tutoring group performed two standard deviations above the average student in the conventional classroom. In practical terms, that means the average tutored student scored better than 98 percent of the students in the conventional class. The tutored students didn’t just do a little better. They occupied a different universe of performance. Roughly 90% of the tutored students reached a level of achievement that only the top 20% of the conventional class reached.
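
Where does the 98 percent come from? It is just the shape of the normal curve: a score two standard deviations above the mean sits at roughly the 98th percentile, which you can verify in one line (assuming approximately normal test scores):

    from statistics import NormalDist

    # Percentile rank of a score two standard deviations above the mean,
    # assuming roughly normally distributed test scores.
    print(NormalDist().cdf(2.0))  # 0.9772... -> better than ~98% of the class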

Bloom’s finding confirmed something that educators and wealthy parents had known for centuries: one-on-one tutoring, where a knowledgeable person sits with a single student and adapts their instruction in real time to that student’s specific needs, confusions, and pace, is catastrophically more effective than anything else we know how to do. It’s not 10% better. It’s not twice as good. It is in a different category entirely.

And then Bloom named the problem. He called it the Two Sigma Problem, and stated it with painful clarity: one-on-one tutoring produces extraordinary results, but it is “too costly for most societies to bear on a large scale.” You can’t give every student a personal tutor. The math doesn’t work. There aren’t enough tutors, and even if there were, no society could afford to pay them. Bloom’s challenge to the field was to find methods of group instruction that could approximate the results of one-on-one tutoring.

I want to be honest about something here, because this is a scholarly blog and you deserve the full picture. Bloom’s original two-sigma claim has been scrutinized in the decades since 1984, and there are legitimate questions about whether the effect is quite as large as he reported. A more recent analysis in Education Next pointed out that the original studies held tutored students to a higher mastery standard (90%) than classroom students (80%), which may have inflated the comparison. The broader meta-analytic literature suggests that the true effect of tutoring may be somewhat smaller than two full standard deviations. But even the most conservative readings of the data agree that one-on-one tutoring produces a large effect — substantially larger than any other instructional intervention that has been reliably measured. The core of Bloom’s insight stands: personalized, responsive, one-on-one instruction is dramatically better than anything else, and for five thousand years it has been available only to the few who could afford it.

Aristotle tutoring Alexander the Great. Royal tutors educating future monarchs. Wealthy families hiring private instructors for their children. The best educational technology in human history has always been a single knowledgeable human being, sitting with a single student, paying attention to that student and only that student, and adapting in real time. Everything else — the textbooks, the lectures, the software, the LMS platforms — has been an attempt to approximate that experience at scale, and every approximation has fallen short by a measurable and significant margin.

For forty years, Bloom’s Two Sigma Problem stood as an open challenge. Find a way to give every student the equivalent of a personal tutor. Nobody solved it. The tools kept getting better — faster, cheaper, more accessible — but they stayed on the same curve. They were still doing the same four things. Store, distribute, drill, assess. The model didn’t change.

November 30, 2022

And then a chatbot launched, and the curve broke.

I want to be careful here, because the hype around generative AI in education is already thick enough to choke on, and I don’t want to add to it thoughtlessly. ChatGPT did not solve education. It did not make human teachers obsolete. It did not fulfill Bloom’s challenge overnight. It is not a replacement for a great teacher, and anyone who tells you it is should not be trusted.

But here is what it did do, and this is the part I want you to see clearly, because it is genuinely new.

For the first time in roughly five thousand years of recorded educational history, a student sat down in front of a tool that was not a faster textbook, not a recorded lecture, not a digital workbook, not an adaptive quiz. It was a conversational, responsive, infinitely patient entity that met the student where they were. It could answer a question. It could answer the follow-up question. It could explain the same concept three different ways until one of them clicked. It could notice that the student was confused about a prerequisite and back up to fill the gap. It could work at 2 AM, on a Sunday, in a language the student’s school didn’t teach in, without getting tired, without getting frustrated, without checking the clock.

None of the tools on the five-thousand-year timeline could do any of that. The printing press couldn’t answer a question. The chalkboard couldn’t notice confusion. Khan Academy couldn’t adapt its explanation in real time based on what a specific student said three seconds ago. Every prior tool was a one-directional delivery mechanism for information. This tool is a conversational partner — imperfect, sometimes wrong, sometimes confidently wrong, but conversational in a way that no educational technology before it has ever been.

The technical term for what happened is discontinuous innovation. Every prior educational technology fell on a continuous improvement curve — each one was a better, faster, cheaper version of the same basic functions. Generative AI did not improve the curve. It introduced a function that was not on the curve at all: real-time, adaptive, conversational, one-on-one instruction, available to anyone with an internet connection, at a cost approaching zero.

That is the function that, for all of human history, required a human tutor. A human tutor who was expensive, scarce, and therefore available only to the privileged. Bloom measured the advantage at two standard deviations. The advantage was available to Alexander the Great, to the children of European monarchs, to the kids whose parents could afford $200 an hour. It was not available to the kid in the rural school with one overwhelmed teacher and thirty-five students in the room.

It might be available now. Not perfectly. Not without caveats. Not without the very real risks of hallucination, of over-reliance, of the substitution of a machine for a human relationship. But the function — the conversational, responsive, adaptive, patient one-on-one instructional interaction — is, for the first time in the history of the species, not locked behind a price tag that only the wealthy can pay.

What I want you to take with you

I did not write this post to sell you on AI. I wrote it to give you perspective, because perspective is the thing the hype cycle steals first.

When you use an AI tutor — and you will, if you haven’t already — I want you to understand where it sits in the longest timeline you can hold in your head. Five thousand years of tools that stored, distributed, drilled, and assessed. Forty years of an unsolved problem that said the best form of education was too expensive for most of humanity. And then a tool that, for all its flaws, introduced a function that had never existed in an affordable, scalable form before.

That’s not hype. That’s history. The tool is imperfect. The tool will get better. The tool will also get misused, overhyped, poorly implemented, and blamed for things that aren’t its fault. All of that is going to happen, because it happens with every technology that matters.

But underneath all of that noise, something real has changed. The curve broke. A function that was previously available only to the privileged is now available to anyone who can type a question into a box. What humanity does with that — whether we waste it or build on it — is an open question, and some of the people who will answer it are reading this post right now.

The printing press didn’t make everyone literate. It took centuries of effort — schools, teachers, curricula, social movements — to turn the press into widespread literacy. The AI tutor will not make everyone educated. It will take effort, design, wisdom, and a lot of thoughtful builders to turn the tool into the transformation it could be.

Some of those builders are going to be you. The tool is here. The timeline delivered it. What you build with it is the next line on the napkin.

Write something worth reading.


Sources and further reading

On Bloom’s Two Sigma Problem: Bloom, B. S. (1984), “The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring,” Educational Researcher, 13(6), 4-16. Based on dissertation research by Joanne Anania and Joseph Arthur Burke at the University of Chicago.

On the scrutiny of Bloom’s original claims: von Hippel, P. T. (2025), “Two-Sigma Tutoring: Separating Science Fiction from Science Fact,” Education Next. This piece provides important context on the methodological limitations of the original studies, including the differing mastery thresholds between conditions, while affirming that the core finding of a large tutoring effect is supported by the broader literature.

On the broader meta-analytic evidence for tutoring effects: VanLehn, K. (2011), “The Relative Effectiveness of Human Tutoring, Intelligent Tutoring Systems, and Other Tutoring Systems,” Educational Psychologist, 46(4), 197-221. Also reviewed in the Nintil systematic review of Bloom’s Two Sigma Problem, which synthesizes mastery learning, tutoring, and direct instruction literatures.

On the history of educational technology: A comprehensive treatment of the progression from oral tradition through digital media can be found in Cuban, L. (1986), Teachers and Machines: The Classroom Use of Technology Since 1920, Teachers College Press. For the broader historical arc: Saettler, P. (2004), The Evolution of American Educational Technology, Information Age Publishing.

On the Aristotle-Alexander tutoring lineage: The canonical example of elite one-on-one tutoring in the ancient world is Aristotle’s tutorship of Alexander the Great, beginning around 343 BCE. Referenced in Bloom’s own framing and in the Education Next analysis.

On discontinuous innovation as a concept: The distinction between continuous (incremental) and discontinuous (paradigm-breaking) innovation is discussed broadly in the innovation literature. A useful entry point: Christensen, C. M. (1997), The Innovator’s Dilemma, Harvard Business School Press — though Christensen’s specific framework (disruptive vs. sustaining innovation) applies to market dynamics rather than pedagogical function.

Note to readers: verify the primary sources yourself before quoting. Bloom’s Two Sigma claim in particular has been the subject of forty years of debate, and the honest scholarly position is that tutoring has a large effect but the exact magnitude remains under discussion. The citations above are entry points into that discussion, not settlements of it.

The Permission to Not Know Everything

On the science of expertise, the art of knowing enough, and why the smartest move a producer can make is to choose what not to learn.


The guilt you’re carrying right now

You’re sitting in front of your computer, and somewhere in one of your open tabs there is a tutorial you should probably watch. Maybe it’s Blender. Maybe it’s Unity. Maybe it’s some new AI framework that just dropped last week and already has six thousand Twitter threads about why you’re behind if you haven’t tried it yet. You tell yourself you’ll get to it tonight. You tell yourself that every day. The list doesn’t get shorter. It gets longer. And underneath the list there’s a feeling you might not have named, but I bet you recognize it: I should know more than I do. Everyone else seems to know more. If I were serious about this, I’d have already learned that tool. What’s wrong with me?

Nothing is wrong with you. What’s wrong is the assumption underneath the guilt — the assumption that a serious professional should be working toward mastery of every tool in their field. That assumption is not just impractical. It is, according to a Nobel Prize-winning economist, mathematically impossible, and the research on expertise says it’s not even desirable.

I want to give you a framework that replaces the guilt with a decision. It’s called the Three-Tier Tool Fluency Model, and it does something simple but powerful: it takes every tool you will ever encounter in your career and asks you to sort it into one of three categories — not based on what the tool deserves, but based on what you need. Once you’ve made the sort, the guilt evaporates, because the guilt was never about the tools. It was about the absence of a decision.

Here are the three tiers, and the research behind each one.

Continue reading The Permission to Not Know Everything

The Architecture of Trust

What thirty years of research on organizational trust has to say about why some virtual communities feel safe and others feel dangerous — and how to build the kind that lasts.


The thing nobody tells you about trust

Here’s a thing you’ve probably experienced but never had a vocabulary for. You walk into a new online community — a Discord server, a game guild, a forum, a virtual world — and within about thirty seconds, before anyone has said a word to you, you have already made a judgment about whether you trust this place. Not whether you like it. Whether you trust it. Whether you are willing to put a small piece of yourself on the table and see what happens.

You can’t quite name what triggered the judgment. Something about the tone of the welcome message. Something about how organized the channels look. Something about whether the moderator names are visible or hidden. Something about whether the recent conversations feel warm or performative. You’re scanning for signals, dozens of them, faster than you can consciously process, and the aggregate of those signals produces a feeling that sits somewhere between “I could belong here” and “I should leave.”

Continue reading The Architecture of Trust

Beyond the Bullet Point List

How Cognitive Science and Neurodiversity Research Should Reshape the Way We Teach Complex Ideas


Open almost any online course, corporate training module, or educational slide deck in 2026 and you will find the same default gesture: dense content broken into bullet points. The bullet is the visual idiom of modern learning design. It signals clarity. It promises ease. For many of us, it is the first formatting move we make when a paragraph starts to feel “too long.”

Yet decades of cognitive science suggest that this default is often wrong — not slightly wrong, but consequentially wrong for the kinds of learning we say we care about most. The bullet is excellent at one thing (quick reference) and poor at something else entirely (building durable understanding of connected ideas). When we confuse these two goals, we produce materials that feel educational while failing to educate.

This article makes the case, from the research literature, for a more careful approach to formatting complex material — one that treats format not as decoration but as a cognitive variable that directly shapes what learners take away. We will look at what working memory can and cannot do, why prose and bullets operate on different cognitive systems, and what research on neurodivergent learners reveals about a common but mistaken assumption: that fragmenting information is always an act of accessibility. The truth, as is so often the case, is more interesting than the folk wisdom.

Continue reading Beyond the Bullet Point List

The Healing in the Headset

What the research actually says about virtual communities and mental health — and why the therapeutic power of virtual belonging turns out to be more real than most people expected.


A thing you already know but might not have words for

If you’ve ever spent real time in a virtual community — not just passing through, but actually living there, building things, forming relationships, coming back night after night to the same group of people — you already know something that the clinical research is only now catching up to. You know that the connections you formed in that space were real. You know that the support you received there mattered. You know that the person who stayed up until 2 AM talking you through a bad night wasn’t less of a friend because you’d never shaken their hand.

You also know that if you said any of this out loud to certain people, they’d look at you like you were describing an addiction. “You should get off the computer and make real friends,” they’d say. “Those aren’t real relationships.” And maybe you nodded, because the cultural script says they’re right, even though something inside you knew they were wrong.

The research says you were right and the script was wrong. Not in every case, not without nuance, and not without some genuine risks that are worth being honest about — but in ways that are documented, measured, and increasingly well-understood. Virtual communities are producing real therapeutic outcomes for real people, in populations that desperately need them. I want to walk you through four of the documented areas, because if you’re going to build virtual worlds, you need to understand that the spaces you create may end up being, for some of your users, the most important support system in their lives.

That’s a weight worth carrying carefully.

Continue reading The Healing in the Headset

Why Your Virtual Village Feels Like Home

The science of why people grieve when their Minecraft house burns down, trade favors with strangers they’ve never met in person, and develop inside jokes about things that never happened in the real world.


A house that isn’t there

Let me tell you about something that happens all the time and that almost nobody takes seriously. Somebody builds a house in a video game. A digital structure, made of digital blocks, sitting on a digital plot of land that exists only as data on a server somewhere. They spend hours on it — maybe weeks. They choose the materials carefully. They place the windows where the light comes in right. They build a little garden out back, because the garden makes it feel complete. The house is not real. It cannot be lived in. It has no value on any market that deals in physical objects.

And when somebody griefs it — when some other player comes along and burns it down or blows it up for laughs — the person who built it feels a surge of anger and loss that is, by any honest measure, real. Not metaphorical. Not exaggerated. The feeling is genuinely comparable, in both quality and intensity, to the feeling of having something physical vandalized. They feel violated. They feel robbed. Some of them log off and don’t come back.

Every experienced gamer knows this. Most people outside of gaming dismiss it. But there is a growing body of research in psychology, neuroscience, and behavioral economics that says the gamers are right and the dismissers are wrong — and that the feelings people develop about virtual places, virtual objects, and virtual communities are not pale imitations of “real” feelings. They are the same feelings, running on the same psychological machinery, triggered by the same mechanisms. The virtual village feels like home because your brain is using the same hardware to process it that it uses to process your actual home.

I want to walk you through four pieces of that research, because they map almost perfectly onto four dynamics that make virtual communities work. And if you’re somebody who designs virtual worlds for a living — or wants to — understanding these dynamics is not optional: they are the difference between building a world people visit and building a world people belong to.

Continue reading Why Your Virtual Village Feels Like Home

The Four Pillars of a Mind

A scholarly look at why memory, personality, emotional intelligence, and motivation are the four things that make a character — or a person — feel real. And what cognitive science has to say about each of them.


The tavern keeper problem

Picture two tavern keepers. Both are characters in a game you’re playing, or in a novel you’re reading, or in an immersive world you’ve been invited to spend time in. Both pour you a drink, both take your coin, both say hello when you walk in.

The first one does nothing else. Every time you walk into the tavern, she gives you the same greeting. She doesn’t remember you. She doesn’t react to whether you saved her village last week or betrayed it. She has no opinions about the weather, no complaints about her back, no idea that the barrel of ale in the corner is cursed. She is, functionally, a vending machine for drinks wearing a person-shaped costume.

The second tavern keeper is also a character. Also pours drinks, also takes coin, also says hello. But she remembers that you helped her daughter recover from the fever six months ago, and her greeting is warmer because of it. She’s naturally cautious — when you ask about the cursed barrel, she weighs the question for a moment before answering, the way a cautious person would. She notices that you look tired tonight and pours you something a little stronger without being asked. And she wants something for herself, too, underneath all of this — she’s been saving up to buy out her brother-in-law’s share of the tavern, because she thinks she could run it better alone, and that ambition colors everything she does.

You know which tavern keeper is the memorable one. You also know which one is more expensive and time-consuming to build, whether you’re writing her as a novelist, scripting her as a game designer, or configuring her as an AI system. The question I want to walk through in this post is why. Why does the second one feel like a person and the first one doesn’t? What are the specific ingredients that have to be present for a character to cross the line from puppet into presence?

The answer, it turns out, is that there are exactly four of them. And they are not a designer’s preference. They correspond to four dimensions that cognitive scientists have been studying in humans for the last fifty years — four specific things the human mind uses to recognize another mind as being real. When you design a character who has all four, you’re not faking personhood. You are activating the parts of your audience’s brain that are already wired to respond to personhood, and those parts don’t care whether what’s in front of them is digital, printed, or physical.

I call these the Four Pillars. Let me walk you through each one, and the research that makes each of them load-bearing.
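
If you were configuring the second tavern keeper as an AI system rather than writing her into a novel, the four pillars translate naturally into a data model. What follows is a minimal sketch, in Python, of one way that translation might look. The class names, fields, and values here are illustrative assumptions drawn from the tavern keeper example above, not a schema the essay or the research prescribes.

    from dataclasses import dataclass, field

    @dataclass
    class Memory:
        """Pillar 1: what she remembers about you, accumulated across visits."""
        events: list[str] = field(default_factory=list)

        def recall(self, topic: str) -> list[str]:
            # Naive substring recall; a real system would use proper retrieval.
            return [e for e in self.events if topic.lower() in e.lower()]

    @dataclass
    class Character:
        name: str
        memory: Memory                  # Pillar 1: memory
        personality: dict[str, float]   # Pillar 2: stable traits, e.g. caution
        reads_emotional_state: bool     # Pillar 3: emotional intelligence
        goal: str                       # Pillar 4: a motivation of her own

    # The second tavern keeper, expressed in this hypothetical model:
    keeper = Character(
        name="the tavern keeper",
        memory=Memory(events=["helped her daughter recover from the fever"]),
        personality={"caution": 0.8, "warmth": 0.6},
        reads_emotional_state=True,     # notices you look tired tonight
        goal="buy out her brother-in-law's share of the tavern",
    )

    # A warmer greeting only when the memory pillar has something to recall:
    greeting = "Welcome back, friend." if keeper.memory.recall("daughter") else "Evening."

The point of the sketch is not the code; it is that each pillar is a separate, inspectable component. A character missing any one of them reads as a puppet, and you can see exactly which socket is empty.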

Continue reading The Four Pillars of a Mind

From Cute Little Helper to Civilizational Threat

How the public’s feelings about artificial intelligence changed more in five years than in the previous fifty — and why the next five years are ours to shape.


Remember when AI was adorable?

I want you to go back in your head to about 2015. If you had an Amazon Echo in your kitchen, you probably thought of Alexa as a friendly little helper. You said “Alexa, what’s the weather” and she told you. You said “Alexa, play some jazz” and she did. When she misheard you — which was often — it was funny, not threatening. She was, in the cultural imagination of the mid-2010s, a charming household appliance. Something between a toaster and a butler. Nobody thought Alexa was going to take over the world.

Continue reading From Cute Little Helper to Civilizational Threat