AI Tried to Fix My Photo–And Revealed More Than Just an Image

In my last post, I wrote about my student Dr. Kristiaan Rawlings who graduated last Friday. I briefly mentioned some fun with graduation pictures–here is what happened.

Part of my philosophy with GenAI is that it’s important to just play with it–and through that play we learn about the nature and characteristics of the AI tools. It was from this playful mindset that I approached a problem I had: although I was willing to pay for some graduation photos, the price was insane. I was curious whether LLMS could remove the watermarks from the images, not expecting to really get anything useful out of it. But why not try?

The experiment led to new thinking about patterns and GenAI—how even visual patterns can become default, and how those defaults reflect larger cultural patterns that are baked into the AI, not unlike inequitable societal patterns. But I’m getting ahead of myself. Here’s what happened.

Attempt 1: ChatGPT-4o – “Remove the words”

I uploaded a screenshot of the watermarked photo and asked for a clean version. The result? Not so great.

The generated image didn’t resemble us personally, though the general details were pretty accurate (robe colors, for example). It reminded me of the “1000-word game” I like to play—the AI got much but filled in the rest with generic placeholders. It wasn’t editing the picture, it was creating a new version. Many details correct, but more stock photo than reality.

Attempt 2: GPT o3 – Copyright Caution

I tried a similar prompt with GPT o3, hoping it would interpret the task differently. Instead, it (appropriately) refused to proceed, citing copyright concerns.

Attempt 3: Gemini 2.5 Pro – Generic Output

Gemini’s response was even more generalized. It seemed to interpret the picture as a “graduation photo” and provided stock-like graduation images with no resemblance to the original. Despite multiple requests, it wouldn’t stop defaulting to generic ideas.

A Creative Pivot: Cartoon Versions

In a text exchange with my colleague Sarah Wiegand, she suggested a clever workaround: ask for a cartoon version! This approach worked surprisingly well. With fewer expectations for detail and accuracy, the cartoon had charm and captured some essence of the moment.

Next I tried to get a “Simpsons” and “Flintstones” version, but ChatGPT wouldn’t generate those due to copyright constraints.

Instead, it recommended a “generic prehistoric style,” which added a playful twist. Perhaps my favorite so far.

So What?

Punya Mishra and I have frequently discussed how images are the key to understanding GenAI. This is no exception–the images revealed something deeper about how these tools function—and what they reveal about us.

GenAI tools reproduce patterns, generalizing ideas and filling in details to match their training data. In the first test, rather than edit the photo, AI followed the theme of “graduation” with more (ChatGPT) or less (Gemini) detail. That tendency to default to familiar tropes isn’t just a quirk—it’s central to how GenAI operates.

Although AI image editing abilities are definitely improving, the default mode is generate, not edit (or exactly match something). Understanding this can help overcome a common misconception about GenAI: when you ask for information, it (usually) does not go to the internet and look it up. Instead, it repeats the patterns it sees in its training data. Again, there are exceptions with newer models, but if it doesn’t give you a source, it is probably just repeating patterns, not looking up an answer–just like it is giving a “graduation photo” and not the specific one I asked for.

The reasoning model I tried–ChatGPT o3–did understand that it was supposed to make edits, and it refused to do so, and rightfully so. The guardrails built into the model seemed to work well. It also refused to create a “Flintstones style” image, though suggested I try “generic prehistoric”–and this revealed something subtler and more pervasive: the way AI leans into dominant cultural defaults.

So what is a “generic prehistoric” style? If you imagine a “generic prehistoric style,” what comes to mind? For many Americans, it’s the Flintstones—even though the cartoon bears little connection to real prehistory. The saturation of certain media in our culture means those images have become our mental default. And because GenAI is trained on vast swaths of this same culture, it learns and reflects those defaults too.

Thus…what did ChatGPT create? Although it refused to do a “Flintstones style” image, its “generic prehistoric” image image looks suspiciously like the Flintstones, down to the clean-shaven face and polk-a-dotted clothes. It is reflecting default patterns of American culture.

The key point here is that AI doesn’t understand originality or context in the way humans do. It reflects the most statistically likely patterns based on its training data. That means it will often default to dominant, familiar images—even when they’re not accurate, diverse, or intentional. In this case, a generic prehistoric cartoon ends up echoing the Flintstones not by design, but because the pattern is so deeply embedded.

This mirrors what I’ve seen in my research on bias: AI replicates inequities not because it “believes” them, but because it mirrors the patterns in its data, which is drawn from a world full of structural inequality. These outputs aren’t inherently true or fair—they’re just statistically frequent. And without thoughtful intervention, GenAI will continue to reinforce–or even magnify–those dominant cultural narratives—whether they’re as benign as a cartoon or as harmful as a racial stereotype.

BTW–although I couldn’t think of a generic way to describe a Simpson’s style, I still managed to create a picture using another work around:

Me: How would you describe a Simpson’s style cartoon?

ChatGPT: A Simpson’s style cartoon typically refers to the distinctive visual style used in The Simpsons, one of the longest-running animated TV shows. Here’s a breakdown of what characterizes this style:

1. Character Design

Yellow Skin: Almost all characters have bright yellow skin (except for a few exceptions like Apu or Carl).
Big Eyes: Large, round, white eyes with black pupils.
Overbites: Many characters have visible overbites, often emphasized by their lip/mouth shape.
Four Fingers: Characters usually have only four fingers on each hand.
Simple Shapes: Characters are drawn with basic, exaggerated shapes—round heads, cylindrical limbs.
Unique Silhouettes: Each character has a clearly recognizable outline, often thanks to distinctive hairstyles (e.g., Bart’s spiky hair, Marge’s tall blue beehive).

2. Color Palette

Bright and Flat Colors: The use of vivid, solid colors with minimal shading.
Primary and Bold Tones: Strong reds, blues, greens, and yellows dominate the palette.

3. Line Work

Black Outlines: Thick, black outlines around characters and objects.
Minimal Detail: Lines are clean and not overly complex—just enough to define expressions or movements.

4. Animation Style

2D Animation: Traditional hand-drawn look, even though much is now done digitally.
Exaggerated Expressions: Highly expressive faces and body language, especially in comedic or dramatic moments.
Consistent Framing: Often uses simple, TV-style composition with fixed background perspectives.

5. Setting and Themes

Suburban America: Most scenes are set in typical suburban environments.
Satirical and Parodic: The style is often used to lampoon or parody real-world people, politics, or pop culture.

Would you like a visual example or a custom drawing in this style?

Me: I’d like to make a version of this picture in that type of style.

Result: