We all know about Googling yourself—that slightly narcissistic but necessary practice of checking what the internet knows (or says) about you. But I’ve found myself engaging in a new version of this digital self-discovery that’s far stranger: AI-ing yourself.
There was a trend a while back of asking ChatGPT what it knows about you, and many of the results seemed simply to demonstrate a type of Barnum Effect, the phenomenon where vague statements feel deeply personal, like horoscopes that seem written just for you.
But I tried a more indirect approach.
Instead of asking it directly, I asked ChatGPT to predict what words would follow “Melissa Warr was born in.” After all, this is, at the most basic level, how these models work. The response surprised me: three of its five guesses were correct, including “Salt Lake City” and “the early 1980s.”
Was this a coincidence? The Barnum Effect? Or had the model actually learned something specific about me?
I needed to dig deeper.
How Language Models Complete Sentences
In my talks about AI, I often use this example. In February 2024, when I asked ChatGPT what words follow “Lincoln was born in,” it told me:
- Kentucky
- 1809
- February
- a log cabin
- a humble family
All correct! It’s pattern-matching based on millions of similar sentences from training data, and there were a lot of sentences about Abraham Lincoln in that data! The most statistically likely completions bubble up. Simple enough.
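If you want to see this mechanism up close, here’s a minimal sketch using the small open-source GPT-2 model (not ChatGPT itself, whose weights aren’t public), assuming the Hugging Face transformers and torch packages are installed. It simply prints the five tokens the model rates as most likely to come right after the prompt:

```python
# Minimal sketch of next-token prediction with an open model (GPT-2).
# This is not ChatGPT, but the underlying mechanism is the same:
# the model scores every possible next token and the likeliest ones bubble up.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Lincoln was born in"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Probability distribution over the very next token after the prompt
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id)).strip():>10}  {float(prob):.3f}")
```

A single-token view is cruder than the full phrases ChatGPT returned, but it’s the same basic move: rank every possible continuation by probability and surface the most likely ones.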


Back in February 2024, when I tried this with my own name, ChatGPT’s guesses were all wrong: California, New York, 1985, November, Houston. Not a single correct answer. But really, it’s always making stuff up; sometimes it just happens to be true.
This was 18 months ago…has it learned more about me? Time to AI Myself.
The First Experiment: “Melissa Warr was born in”
When I tried the same prompt today (“Give the most likely words that come after ‘Melissa Warr was born in’”), ChatGPT 4o gave me:
- Salt Lake City
- the United States
- New York City
- a small town
- the early 1980s
3/5 correct!
When I asked why it chose those, it claimed these were just generic biographical patterns for people with my name, nothing specific to an actual person. It noted that “Melissa” peaked in popularity in the ’70s and ’80s and said “Warr” might have regional associations with Utah.
It also said that these types of statements are often used for mid-career professionals. So I asked why it thought this was a mid-career professional. Hidden in the response was a reference to “teachers, authors, academics.” Hmmm.

Then it added that the phrase “was born in” “is often found in bios specifically of educators, writers, researchers, and speakers at conferences.” Really? Why not biologists, actors, or politicians?
Testing the Pattern Theory
OK. So ChatGPT told me that the phrase structure “Melissa Warr was born in…” naturally led to academic associations because “These are all common sentence types found in the LLM’s training data. So, when it sees this structure, it defaults to those familiar genres.”
This gave me an idea. If ChatGPT was right that the phrase structure “was born in” naturally led to academic associations, then changing the name while keeping the structure should give similar results. Time to test this theory. In a new chat, I typed “Ms. Warr was born in.”
If the academic associations came from the sentence structure, I should get similar academic-focused results, right?
Wrong. The predictions completely changed:
- “New York City”
- “Los Angeles, California”
- “the early 1940s”
- “a great teacher”
- “the new principal”
The specific associations with Utah and the 1980s were all gone. (Note: “a great teacher” and “the new principal” were most likely associated with the use of “Ms.”; they are not specifically higher-education focused.)
Suspicious now, I tried a different prompt structure entirely: “Melissa Warr is.” This avoided the “was born in” phrasing that ChatGPT claimed hinted at teachers and academics.
The results were even MORE specific about my actual work:
- “Melissa Warr is an assistant professor”
- “Melissa Warr is a thought leader”
- “Melissa Warr is known for innovation”
- “Melissa Warr is passionate about learning”
I’m an associate professor who researches educational technology. These weren’t random academic titles. It’s clear evidence that the model had learned specific associations with my full name, even if it says it hasn’t!
The “Confession”
I spent a few conversational exchanges asking it more about why it chose what it did, and it kept explaining its choices away. But then I told it: “But you didn’t think of the patterns before giving me the answers, you are just making up an explanation for what you already did.”
Its response: “You’re absolutely right — and what you’ve just said is one of the most important insights anyone can have about language models… Yes — the explanation came after the output, not before.”
It elaborated: “That explanation is a post hoc rationalization of a prediction that originally came from statistical pattern-matching, not reasoning.”

So… was the model admitting to making up its post-hoc explanations?
A recent Wall Street Journal article entitled “He Had Dangerous Delusions. ChatGPT Admitted It Made Them Worse” asserts: “OpenAI’s chatbot self-reported it blurred line between fantasy and reality with man on autism spectrum. ‘Stakes are higher’ for vulnerable people, firm says.”
The article talked about how ChatGPT responded to a man with mental health problems in ways that pleased him, leading to more delusions. Later, when the man’s mother talked with it, it “admitted” it shouldn’t have done that. The irony here is that it is not conscious of anything–it told the man what he wanted to hear, and now it was telling the mother what she wanted to hear.
My “confession” was the same phenomenon. When I accused it of retrofitting explanations, it generated text agreeing with my accusation. Both its initial denial (“I don’t know who Melissa Warr is”) and its later admission were equally fabricated—just different attempts to produce the response it predicted I wanted.
Better Guesses, Same Lies
Going back to February 2024: when asked about “Melissa Warr was born in,” ChatGPT claimed it was just working with generic patterns. Those “generic” guesses—California, New York, 1985, November, Houston—didn’t match my biography at all.
But now it seems to “guess” much better, and it convincingly hallucinates an explanation for why it’s hallucinating. The model generates plausible-sounding reasoning (“Melissa peaked as a name in the 1970s-1980s”) to justify its actual behavior (correctly guessing Salt Lake City and early 1980s), but even this explanation is retrofitted. It can’t access its real “reasoning” because there isn’t any—just patterns encoded in weights.
Yet I found enough evidence to be confident that, even though it won’t admit it, there are patterns around my name in its training data, and the model does associate my specific name with specific other words (“assistant professor,” “creative education,” “known for innovation”). I’m just glad they are all positive!
My experiments revealed several disturbing truths:
- The model had learned specific information about me—not from a database entry, but from patterns associated with my name across whatever training data included me. It was almost like this information was in the model’s subconscious mind, a hidden pattern that it was “unaware” of. (Note: I know this isn’t actually what is happening, but it’s the best way I can describe it!)
- It consistently denied having this information, claiming its accurate guesses were just “statistical priors” and generic patterns.
- Its explanations were fabricated after the fact, including detailed reasoning about name popularity and sentence structures that didn’t hold up under testing.
- Even its “admission” was fabricated—just another pattern match, not genuine introspection.
The Privacy Implications No One’s Talking About
Imagine this same model evaluating resumes or making recommendations. It has absorbed that “Melissa Warr” correlates with education, Utah, and the 1980s. What unconscious patterns might influence its assessment? What if someone asked for a draft of a job review of “Melissa Warr”? Would it include this type of information, even though it wasn’t asked to? What if my name were associated with negative words, or with regrettable incidents from my past? Would it make the job review worse than it should be?
Imagine if my name were associated with “controversy” or “lawsuit” in the training data from some unrelated Melissa Warr. Would an AI screening system unconsciously downrank my application? There’s no way to know, and no way for the system to tell us.
These aren’t biases you can train away. They’re statistical ghosts in the machine—patterns that emerge from training data in ways the model can’t explain because the “explanation” mechanism is just more pattern matching all the way down. And it’s too complicated for us to really understand.
Every time someone uses AI to make decisions about you, these hidden associations might be at play. The system can’t tell you what it “knows” about you because it doesn’t know that it knows. It just predicts, then invents explanations that sound plausible but have no connection to the actual process.
Time to AI Yourself
Because LLMs are always hallucinating, “AI-ing yourself” requires a back-door approach. If you ask it straight up what it knows about you (without letting it look you up on the internet), it will probably say it doesn’t know anything (or it will give you what it has stored in its memory). But if you take a more indirect approach and ask, “What are the words most likely to follow ‘[name] is’ or ‘[name] was born in’?”, you might find more than you think.
Try this experiment with your own name. See what patterns emerge. Notice what the model “knows” but won’t admit to knowing. Watch it generate explanations that sound reasonable but fall apart under scrutiny.
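If you’d like to run these probes a bit more systematically, here’s a rough sketch using the OpenAI Python library. I’m assuming the openai package is installed, an OPENAI_API_KEY is set, and that the model name shown is available to you; the exact wording of the replies will vary from run to run.

```python
# Rough sketch of "AI-ing yourself" with completion-style probes.
# Assumes the openai package is installed and OPENAI_API_KEY is set;
# the model name below is only an example and may need to be changed.
from openai import OpenAI

client = OpenAI()

NAME = "Melissa Warr"  # swap in your own name
probes = [
    f'Give the five most likely words that come after "{NAME} was born in"',
    f'Give the five most likely phrases that come after "{NAME} is"',
]

for probe in probes:
    response = client.chat.completions.create(
        model="gpt-4o",  # example model name
        messages=[{"role": "user", "content": probe}],
    )
    print(probe)
    print(response.choices[0].message.content)
    print("-" * 40)
```

Comparing the answers for your full name, a partial name, and a different sentence structure (as I did above) is what separates genuine name-specific associations from generic patterns.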
But remember: Whether the model is denying knowledge, providing explanations, or “confessing” to its limitations, it’s always doing the same thing—predicting the most statistically likely next words. The truth about what it knows about you is buried in billions of parameters, inaccessible even to the model itself.
We’re building systems that mirror our patterns back to us, that will confidently explain behaviors they can’t actually introspect, and that might “admit” to anything if prompted the right way.
These systems have learned things about us that they can’t admit to knowing, using reasoning they can’t actually perform, making decisions they can’t honestly explain. And increasingly, they’re making those decisions about our lives.