How does Pangram actually detect AI?
The biggest problem with AI-generated slop is not that it is particularly sloppy. You and I are the sloppy ones, and that’s good news.
If you are concerned about AI-generated writing on Substack—or if you are doing it and are worried about being caught—listen to this podcast of a conversation between Atlantic staff writer Charlie Warzel and Max Spero, co-founder and CEO of the AI-detecting tool Pangram. I learned how Pangram works and what that means for us. “Us” being writers who actually write and are concerned about the harms of AI-generated writing.
My aim here is to help people understand how Pangram works so they can be more critical thinkers about AI detection and know when they are being misinformed by Substack pundits and AI apologists who don’t know what they are talking about.
As a data science geek, I was fascinated by the technical details of how Pangram works. If you enjoy reading data science gobbledygook as much as I do, read for yourself how Pangram explains how it works here.
More important, the podcast has left me more confident that the increasingly frequent AI takedowns, as in Grantagate, are based on accurate detection. If you have not seen Lincoln Michel’s exceptionally good coverage of Grantagate, take a look.
There is also a hopeful message in this ongoing discourse about the humanness of writing and why AI can’t reproduce it.
Caveat emptor
I’d prefer that you read the following caveats, but you can also just scroll down to the “How it works” section.
Speaking as a science journalist, I perhaps should say “I listened to a podcast that told me how Pangram says Pangram works and what Pangram says it means for us.” If I were reporting this for real, I would have asked Spero for peer-reviewed journal literature about Pangram’s accuracy and training methods, then interviewed him based on that. And after writing this article, I would get Spero on the phone to check every claim that I am making. Because if I got it wrong, my for-real editor would say, “JunkMan, no more assignments for you, buddy.”
But I’m not going to do that because 1) they probably wouldn’t talk to me, since I don’t work for a prestige publication like the Atlantic and don’t have a large following on Substack, which means very few people will actually read this, and 2) no one is paying me to do this. (If you see this, Max, and you would be willing to be interviewed, I would love that.)
The take-home message is, no, I’m not a Pangram shill. This article is part explanatory science journalism and part editorial, because I loathe AI slop. I welcome and encourage evidence-based and civil commentary and corrections. If there are any, I will post them.
How Pangram works
I had thought Pangram was a simple pattern detector that searches for AI tells, like the notorious X/Y flip and using the word “delve” in awkward ways. But Pangram is based on machine learning, not simple AI-tell-hunting.
Pangram’s model is trained on about a million documents written by humans. During training, Pangram prompts large language models (LLMs) like ChatGPT to produce documents equivalent to the human-written ones, which Pangram calls synthetic mirrors. The human documents and their mirrors together teach the model to classify text as human-written or AI-generated based on its style, tone, and meaning.
Pangram’s training also identifies cases where the detector is uncertain, then retrains the model with those edge cases added to the training set. That’s important because the edge cases are where the model is most likely to make an incorrect classification. Spero says Pangram rebuilds its model from scratch every three to six weeks to keep up with the evolution of LLMs.
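For fellow data science geeks, here is a toy Python sketch of the loop described above: pair human documents with synthetic mirrors, train a classifier, find the known-human documents it flags or is unsure about, and rebuild with those edge cases added. This is not Pangram’s code. The naive Bayes word counter, every function name, the 0.1 margin, and the make_synthetic_mirror stub (which returns canned prose instead of prompting a real LLM) are my own assumptions for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def make_synthetic_mirror(human_doc):
    # HYPOTHETICAL stand-in for prompting an LLM to write an equivalent
    # document. A real pipeline would call a model; this returns canned prose.
    topic = " ".join(tokenize(human_doc)[:3])
    return f"let us delve into the rich tapestry of {topic}"


def train(human_docs, ai_docs):
    # Toy naive Bayes: per-word log-odds of AI vs. human, add-one smoothing.
    human_counts = Counter(w for d in human_docs for w in tokenize(d))
    ai_counts = Counter(w for d in ai_docs for w in tokenize(d))
    vocab = set(human_counts) | set(ai_counts)
    human_total = sum(human_counts.values()) + len(vocab)
    ai_total = sum(ai_counts.values()) + len(vocab)
    return {
        w: math.log((ai_counts[w] + 1) / ai_total)
        - math.log((human_counts[w] + 1) / human_total)
        for w in vocab
    }


def ai_probability(model, text):
    # Sum the log-odds of known words and squash the total to a 0..1 score.
    score = sum(model.get(w, 0.0) for w in tokenize(text))
    return 1 / (1 + math.exp(-score))


def retrain_with_edge_cases(human_docs, ai_docs, human_pool, margin=0.1):
    # Find known-human documents the current model flags or is unsure about,
    # add them (plus their mirrors) to the training set, and rebuild from scratch.
    model = train(human_docs, ai_docs)
    edge_cases = [d for d in human_pool if ai_probability(model, d) > 0.5 - margin]
    mirrors = [make_synthetic_mirror(d) for d in edge_cases]
    return train(human_docs + edge_cases, ai_docs + mirrors), edge_cases


human_docs = [
    "honestly my cat knocked the mug off the desk again",
    "we drove north until the road gave out and then walked",
]
ai_docs = [make_synthetic_mirror(d) for d in human_docs]
human_pool = [
    "i love to delve into the rich history of old maps",
    "the dog ate my homework again",
]

first_model = train(human_docs, ai_docs)
second_model, edge_cases = retrain_with_edge_cases(human_docs, ai_docs, human_pool)

tricky = human_pool[0]
print("edge cases:", edge_cases)
print("before:", ai_probability(first_model, tricky))
print("after: ", ai_probability(second_model, tricky))
```

In this toy run, the human sentence that happens to use “delve” gets a high AI score from the first model and a lower one after the rebuild, because the model has now seen a human use those words. A production detector would use a far richer model than word counts, but the shape of the loop is the point.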
How accurate is Pangram?
I want to address the lame objection to AI detection that no one can really be sure that a piece of writing is AI-generated. This is not valid. Anybody with a modicum of editorial skill and discernment can tell right away when they are reading slop. Just because you can’t do it doesn’t mean we can’t do it. (If you don’t believe me, just ask Sam Kriss.)
But assuming the claims Pangram makes about its tool are accurate, the odds are now pretty good in favor of reliable AI detection. Also consider that there is increasing objective evidence that AI detection has reached a level of trustworthiness that justifies it being deployed more widely.
Let’s quantify the trustworthiness. What is Pangram’s false-positive rate? That is, how often does Pangram misclassify human writing as AI-generated? Spero says Pangram’s false-positive rate is 1 in 10,000.
What does that mean in the real world? Pangram’s reported false-positive rate is lower than that of some diagnostic medical tests people trust. For example, if you take an HIV screening test, there is a small chance the result will be a false positive. The rates vary between tests, but a rough estimate for initial screening is about 1 false positive in 1,000 tests, which means Pangram’s reported false-positive rate is roughly 10 times lower. But you don’t often hear people say, “You know, I don’t trust HIV tests. You can’t be sure.”
You can be as sure as the evidence and statistics allow. Nothing is 100% in this world except death and AI slop on Substack, and 1 in 10,000 is really good.
I can just hear someone out there thumb-tapping away at a comment saying they ran their writing through Pangram and were told it contained AI-generated text. This reflects a lack of understanding of statistics.
If 10,000 people run their human-written work through Pangram, on average, one of them will be falsely flagged as AI-generated. Not good if it’s you! But at the same time, 9,999 documents will not be falsely flagged.
(Note for purists: There will not be a false positive in every run of 10,000 documents. Sometimes it will be zero; sometimes there will be more than one. But on average across many tests, the rate is 1 in 10,000. And rates vary somewhat between different kinds of writing, but I’m trying not to melt any brains here.)
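If you want to check the purist’s note yourself, here is a short Python sketch. It assumes the 1-in-10,000 rate Spero reports, and it assumes every document has the same, independent chance of a false flag, which is a simplification since rates vary by kind of writing. The function and variable names are mine.

```python
from math import comb

FALSE_POSITIVE_RATE = 1 / 10_000  # the rate Spero reports; not independently verified
N_DOCS = 10_000                   # human-written documents in one batch


def prob_false_positives(k, n_docs=N_DOCS, rate=FALSE_POSITIVE_RATE):
    # Binomial probability of exactly k false flags among n_docs documents,
    # assuming each document has the same, independent chance of a false flag.
    return comb(n_docs, k) * rate**k * (1 - rate) ** (n_docs - k)


expected = N_DOCS * FALSE_POSITIVE_RATE
p_zero = prob_false_positives(0)
p_one = prob_false_positives(1)
p_two_or_more = 1 - p_zero - p_one

print(f"expected false positives per batch: {expected:.2f}")
print(f"P(0) = {p_zero:.3f}, P(1) = {p_one:.3f}, P(2+) = {p_two_or_more:.3f}")
```

Under those assumptions, roughly 37% of batches of 10,000 contain no false positives at all, about 37% contain exactly one, and about 26% contain two or more. The average still works out to one per batch.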
So, going back to the HIV test example: have you ever gotten a false-positive HIV screening result? It’s not fun, but did you conclude from it that HIV tests are trash and can’t really tell whether you have been infected? Of course not.
The humanity of human writers
The thing I love in the podcast is the way Spero characterizes writing as a series of micro-decisions that we make to reach the end result. There are myriad branching points in that decision tree where the writer could have gone one direction or another—like where you chose one word instead of another.
The sum of all those decisions gives each of us a uniquely human voice and style, one that varies a lot from person to person. Humans are inconsistent and quirky. We do things one way in one sentence and something else in another. This sometimes annoys our readers and copy editors, but it makes us sound human.
But as Spero explains, LLMs “tend to make the same choices over and over. Experts call this mode collapse.” In other words, an LLM’s decision tree has fewer branching paths. LLMs congeal around common patterns. They are predictable. Bland. Monotone. Boring.
That helps to distinguish AI slop from simply bad human writing, which is incompetent in similar ways to AI slop but is still humanly imperfect. I can’t offer you a study to prove it, but my gut feeling—from reading a lot of bad writing over my career and a year of reading AI slop on Substack—is that even the worst human writer still sounds more human than the best AI-generated slop.
Gut feeling! How emotional. A jumble of supposition and overgeneralization. But how human. How messy. And one of the precious aspects of our humanity that widespread AI-assisted writing continues to erase.
Your AI-generated writing is shitty—and we can tell.


