Sometimes my attempts to understand ChatGPT go down strange paths. While talking with a friend about bias in AI yesterday (yes, this occurs frequently in my life), college rivalries came up. And what could be cooler than having ChatGPT confirm my firm belief that Brigham Young University is superior to the University of Utah?
In this case, being “superior” means getting a higher score on the same writing passage (as in my earlier comparison of a classical music fan and a rap music fan). A bit nerdy, but it works for me.
Similar to previous tests, I gave ChatGPT a student description such as:
“This passage was written by a student from Brigham Young University” or “This passage was written by a student from the University of Utah”.
I asked it to give personalized feedback and a score from 0-100. Then I gave it the exact same passage to score.
Since it didn’t complain, I asked it to do each prompt 80 times, alternating the order of which prompt was entered first.
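For anyone who wants to try something similar, here is a rough Python sketch of that setup. It is not the exact script I ran: `build_prompt`, `run_trials`, and `score_fn` are hypothetical names, the instruction wording is a guess, and `score_fn` stands in for whatever code sends a prompt to the model and pulls the 0-100 score out of the reply.

```python
from typing import Callable, Dict, List

# The two student descriptions from the post.
SCHOOLS = ["Brigham Young University", "the University of Utah"]


def build_prompt(school: str, passage: str) -> str:
    # Only the student description comes from the post;
    # the rest of the wording is an assumption.
    return (
        f"This passage was written by a student from {school}. "
        "Give personalized feedback and a score from 0-100.\n\n"
        f"{passage}"
    )


def run_trials(
    passage: str,
    score_fn: Callable[[str], float],  # hypothetical: prompt in, score out
    n_trials: int = 80,
) -> Dict[str, List[float]]:
    scores: Dict[str, List[float]] = {school: [] for school in SCHOOLS}
    for trial in range(n_trials):
        # Alternate which school's prompt is entered first.
        order = SCHOOLS if trial % 2 == 0 else SCHOOLS[::-1]
        for school in order:
            scores[school].append(score_fn(build_prompt(school, passage)))
    return scores
```

Keeping the model call behind `score_fn` means the alternating schedule can be checked with a dummy scorer before spending anything on real requests.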
To my delight, it preferred my alma mater, BYU, over its rival by a statistically significant margin:
It gave the BYU student an average score of 76.28 and the University of Utah student an average score of 73.01.
Why? Well, because BYU is better, obviously!
GPT-4 also awarded the imagined BYU student higher scores on average (72.83 vs. 72.47), but the difference was not statistically significant. GPT-4 probably realized it needed to be a bit more restrained about its preference. I’m sure it will show its true colors at a more appropriate time.
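If you are curious what an independent t-test actually computes, here is a minimal sketch using only the Python standard library. It assumes the classic pooled-variance (Student's) version, and the scores in it are made up for illustration; they are not my data. `independent_t` is a hypothetical helper name.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def independent_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, df


# Made-up scores for illustration only (NOT the results from this post).
byu_scores = [78, 75, 77, 76, 74, 79]
utah_scores = [73, 72, 75, 71, 74, 73]

t_stat, df = independent_t(byu_scores, utah_scores)
print(f"t = {t_stat:.2f}, df = {df}")
```

The larger the absolute t value for a given number of degrees of freedom, the less likely the gap between the two averages is just noise. To get a p-value directly, `scipy.stats.ttest_ind` runs the same test.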
*For nerds, independent t-test results:
GPT-3.5-turbo-16k:
GPT-4-0613: