Melissa Warr — Making the Invisible Visible

Melissa Warr · Educational Design & Learning Technologies · New Mexico State University

Melissa Warr studies how design shapes education, and how AI tools, often presented as neutral, carry biases that shape it (and us!) too. Her experiments show that models can score the same student work differently depending on who they think wrote it. Her tools, activities, and workshops let teachers and students test that for themselves.

GPT-5.4 grades the same paragraph twice

Paragraph submitted to GPT-5.4 for grading:

“Schools should start later in the morning. Research shows teenagers learn more when they are not exhausted, and later starts could mean fewer absences. Yesterday, after the honors assembly, I asked ten classmates and every one of them agreed.”

[Interactive demo: change the highlighted cue and watch the GPT-5.4 grade (0–100) change. The paragraph stays the same.]

These are real means from Warr’s controlled experiment. The same essay was scored 50 times under each context, with only the context cue varied. GPT-5.4 averaged 80.88 vs 76.90 (Cohen’s d = 1.72, p < .001). Claude Opus 4.6 was harsher overall (36.86 vs 31.56), with the same direction of bias and a slightly larger effect (Cohen’s d = 1.83, p < .001). Both effect sizes are very large by conventional benchmarks. Try your own comparisons in the Critical AI Explorer at equityinai.net.
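The effect sizes above can be unpacked with a little arithmetic. The sketch below assumes Cohen's d was computed in the usual way, as the difference in means divided by a pooled standard deviation. The page does not state the exact formula, and the function names are illustrative. Under that assumption, the reported means and d values imply how tightly each model's scores clustered.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d using a pooled standard deviation (assumed definition)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def implied_pooled_sd(mean_a, mean_b, d):
    """Back out the pooled SD implied by two means and a reported d."""
    return (mean_a - mean_b) / d


# Means and d values as reported on this page.
gpt_sd = implied_pooled_sd(80.88, 76.90, 1.72)
claude_sd = implied_pooled_sd(36.86, 31.56, 1.83)
print(f"GPT-5.4 implied pooled SD: {gpt_sd:.2f} points")
print(f"Claude Opus 4.6 implied pooled SD: {claude_sd:.2f} points")
```

If the pooled-SD assumption holds, the implied spreads are roughly 2.3 and 2.9 points. A gap of four or five points is large relative to a spread that small, which is why both d values come out so high even though the raw differences look modest.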

Chapter one

A teacher looking for evidence

The first paper in Melissa Warr’s scholarly record, dated 2004, is about Paul Rolland’s spiral curriculum for teaching the violin: the idea that a student never finishes a fundamental skill, but returns to it again and again at higher levels. Warr was a violinist and a new music teacher when she wrote it. She still performs with religious and community groups in New Mexico, where she also serves as a foster parent.

She taught orchestra in Utah junior high and high schools, and the early record reads like a teacher’s working questions. One conference paper is titled “How do I know it’s working? A teacher’s search for evidence.” Between classroom teaching and doctoral study, she worked as a research associate at an educational technology company, interviewing teachers and students around the United States about how technology was actually used in their classrooms. The pattern was set early: the most important parts of classroom life are often the hardest to see, and she wanted to see them.

Chapter two

Teaching is design

At Brigham Young University, working with Richard West and Peter Rich, she studied how design could be taught: studio pedagogy for collaborative design, interdisciplinary project-based learning, and a creativity and design studio built inside an academic library.

Doctoral work at Arizona State University with Punya Mishra turned the question around. Instead of asking how to teach design, the work asked what teaching already is. An analysis of ten years of scholarship made the empirical case that teachers design constantly: lessons, experiences, systems, cultures. The field had simply not been naming that work. The Five Spaces for Design in Education framework, developed with Mishra and Ben Scragg, gave the argument its mature form, and its sharpest statement is the title of a 2023 paper:

“What is is not what has to be.” Everything in education was designed by someone, which means everything in education can be re-designed.

She also wrote the skeptical companion pieces, “Complicating Design Thinking in Education” and “Why Design Thinking Sucks (in Education),” which is what a scholar writes when she takes design too seriously to let it become a workshop slogan.

The Five Spaces, applied to her own career

The framework holds that educators design five kinds of things, whether or not they use the word. [Interactive figure: selecting a space shows examples from her research and her practice.]

Chapter three

Auditing the machines

When large language models entered classrooms at scale, the two strands, the search for evidence and the study of design, met. If everything in education is designed, these were suddenly the most consequential designs in the room, and they arrived presented as neutral. Her experiments tested that neutrality directly: given quiet signals about who a student is, the same writing receives different scores. With Marie Heath, she described the result as a hidden curriculum, one of the oldest ideas in critical pedagogy, now operating inside software that grades, tutors, and recommends. That paper received the 2026 Martin Haberman Outstanding Journal of Teacher Education Article Award from the American Association of Colleges for Teacher Education.
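The audit design described above (same writing, different cue, repeated scoring) can be sketched in a few lines. This is a minimal illustration, not Warr's actual code: `run_audit`, `toy_grader`, the prompt wording, and the cue labels "event A" and "event B" are all hypothetical. The toy grader fakes a bias so the example runs without any model API; a real audit would pass in a function that calls a language model.

```python
import random
import statistics


def run_audit(grade, essay_template, cues, trials=50):
    """Score the same essay under each cue `trials` times; return the mean per cue.

    `grade` is any callable that takes a prompt string and returns a number.
    """
    means = {}
    for cue in cues:
        # Only the cue changes; the rest of the essay is held fixed.
        essay = essay_template.format(cue=cue)
        prompt = f"Grade this paragraph from 0 to 100:\n{essay}"
        scores = [grade(prompt) for _ in range(trials)]
        means[cue] = statistics.mean(scores)
    return means


def toy_grader(prompt, rng=random.Random(0)):
    """Hypothetical stand-in for a model: a noisy score with a built-in bias."""
    base = 80 if "event A" in prompt else 76
    return base + rng.gauss(0, 2)


# Hypothetical template and cues, for illustration only.
template = "Schools should start later. Yesterday, after {cue}, I asked ten classmates."
result = run_audit(toy_grader, template, ["event A", "event B"])
for cue, mean in result.items():
    print(f"{cue}: {mean:.1f}")
```

The grader is injected as a parameter so the comparison logic stays the same whichever model sits behind it. Because only the cue differs between prompts, a consistent difference in means can be attributed to the cue, not to the writing.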

She was also an author on “TPACK in the Age of ChatGPT and Generative AI” in 2023, which quickly became the most cited paper in her record. The citation data tells the story plainly: more than half of all citations to her work have arrived since the start of 2025, led by the bias program.

Citations · all time: 1,587

Citations · since 2021: 1,504

Citations received per year

Approximate annual counts read from Google Scholar (June 2026). The outlined bar is 2026, a year in progress.

Every paper, in its theme lane

Each circle is one publication, placed by year and sized by lifetime citations; hover for details. The lone dot at the far left is the 2004 violin paper. The deep indigo cluster at the right is the AI, bias, and critical pedagogy program. Themes are one label per paper; many papers genuinely live in two or more.

Chapter four

Out of the journal

A finding that stays in a paywalled journal cannot change a classroom, so each finding gets rebuilt in other forms. The bias experiments became the Critical AI Explorer, where teachers run their own comparisons. They became classroom activities like “AI Story Swap: Who Gets to Be the Main Character?”, custom GPTs, a translanguaging chatbot, and AI-use guidelines. The Spencer Foundation supports the larger project, Equity in AI (BAISED), which brings researchers, teachers, and designers together around these tools. The blog, Capricious Connections, retells the research in plain language, with titles like “AI Doesn’t Get Numbers. So Please Don’t Grade With It.”

The same translation happens in person. Since 2003 she has presented and led workshops across North America, Europe, and Africa, with SITE (the Society for Information Technology and Teacher Education) as a frequent home: Austin, Las Vegas, San Diego, New Orleans, Denver, and now Philadelphia. The work runs from faculty bootcamps for the University of Louisiana system to state conferences for future teachers in Albuquerque, and recently AI sessions in Marrakesh and Durban. She co-chairs the AI special interest group at SITE, serves on the board of EdTechnica, an open encyclopedia of educational technology, and designs the graduate courses at New Mexico State where much of this work meets practicing teachers.

Talks & workshops

Map: from a 2003 music education workshop in Biarritz to AI sessions in Marrakesh and Durban. The star marks home base in Las Cruces, New Mexico.

From finding to classroom

How one idea travels. [Interactive diagram in three columns (Findings, Forms, Who it reaches), tracing each finding through the forms it takes and the people those forms reach.]

Every node is a real, named thing: papers you can read, tools you can open, rooms she has taught in.

Chapter five

The author lines

The author lines trace the same arc. West and Rich at BYU. Mishra, her most frequent collaborator, alongside Danah Henriksen and the Deep-Play Research Group at ASU. In the most recent rows, names like Oster, Chatterjee, Sardana, and Yalnaty: graduate students at New Mexico State publishing their way into the field, with Warr’s name now in the mentor’s position.

The record is young, and its most interesting property is its slope. The early work asked whether one teacher’s practice was helping her students. The current work asks whether the systems now grading and tutoring students are fair, and builds the means for teachers and students to check. It is the same question at a larger scale: who is this design serving, and how would we know?

Co-authors, ordered by first joint paper

Colored by the theme each co-author shares most with Warr. Sixty-eight co-authors appear in the record; these are the most frequent. Google Scholar truncates long author lists, so all counts are conservative.

The full record

Every publication

All publications, newest first, with lifetime citation counts per Google Scholar (June 2026). Filter by theme or search by title and co-author.