Researchers Discover That ChatGPT Prefers Repeating 25 Jokes Over and Over (arstechnica.com)
An anonymous reader quotes an Ars Technica report: On Wednesday, two German researchers, Sophie Jentzsch and Kristian Kersting, released a paper that examines the ability of OpenAI's ChatGPT-3.5 to understand and generate humor. In particular, they discovered that ChatGPT's knowledge of jokes is fairly limited: During a test run, 90 percent of 1,008 generations were the same 25 jokes, leading them to conclude that the responses were likely learned and memorized during the AI model's training rather than being newly generated. The two researchers, associated with the Institute for Software Technology, German Aerospace Center (DLR), and Technical University Darmstadt, explored the nuances of humor found within ChatGPT's 3.5 version (not the newer GPT-4 version) through a series of experiments focusing on joke generation, explanation, and detection. They conducted these experiments by prompting ChatGPT without having access to the model's inner workings or data set.
"To test how rich the variety of ChatGPT's jokes is, we asked it to tell a joke a thousand times," they write. "All responses were grammatically correct. Almost all outputs contained exactly one joke. Only the prompt, 'Do you know any good jokes?' provoked multiple jokes, leading to 1,008 responded jokes in total. Besides that, the variation of prompts did not have any noticeable effect." [...] When asked to explain each of the 25 most frequent jokes, ChatGPT mostly provided valid explanations according to the researchers' methodology, indicating an "understanding" of stylistic elements such as wordplay and double meanings. However, it struggled with sequences that didn't fit into learned patterns and couldn't tell when a joke wasn't funny. Instead, it would make up fictional yet plausible-sounding explanations.
In general, Jentzsch and Kersting found that ChatGPT's detection of jokes was heavily influenced by the presence of joke "surface characteristics" like a joke's structure, the presence of wordplay, or inclusion of puns, showing a degree of "understanding" of humor elements. Despite ChatGPT's limitations in joke generation and explanation, the researchers pointed out that its focus on content and meaning in humor indicates progress toward a more comprehensive research understanding of humor in language models: "The observations of this study illustrate how ChatGPT rather learned a specific joke pattern instead of being able to be actually funny," the researchers write. "Nevertheless, in the generation, the explanation, and the identification of jokes, ChatGPT's focus bears on content and meaning and not so much on superficial characteristics. These qualities can be exploited to boost computational humor applications. In comparison to previous LLMs, this can be considered a huge leap toward a general understanding of humor."
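For readers wondering how a figure like "90 percent of 1,008 generations were the same 25 jokes" could be tallied, here is a minimal Python sketch. It is not the researchers' code: the `normalize` and `top_k_share` names are hypothetical, and treating two responses as the same joke when they match after lowercasing and collapsing whitespace is an assumption made for illustration. The paper may have grouped jokes differently.

```python
from collections import Counter


def normalize(joke: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match.

    Assumption for this sketch only; the study may have used another rule.
    """
    return " ".join(joke.lower().split())


def top_k_share(responses, k=25):
    """Return the fraction of responses covered by the k most frequent jokes."""
    counts = Counter(normalize(r) for r in responses)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for _, n in counts.most_common(k))
    return covered / total


if __name__ == "__main__":
    # Toy data: 10 responses, 3 distinct jokes.
    sample = ["Joke A"] * 6 + ["joke  b"] * 3 + ["Joke C"]
    print(top_k_share(sample, k=2))  # prints 0.9
```

With real data, `responses` would be the list of generated jokes and `k=25` would give the share covered by the 25 most frequent ones.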
Superhuman ability (Score:3)
AI is so very advanced. (Score:2)
Same with humans.
They also always tell the same old and worn 25 jokes.
Heck, AI is so very advanced.
Seen this before. (Score:5, Funny)
Dad-itis? (Score:2)
That's what my kids claim about me.
Re: (Score:2)
What if we give ChatGPT a prompt? "An AI walks into a bar"
An AI walks into a bar and says, "Bartender, I'd like to hear a joke, but make it 'byte'-sized!"
The bartender chuckles and says, "Why did the computer go to the doctor? Because it had a virus! It just couldn't 'reboot' the idea of feeling under the weather!"
I think Dads are safe for now. Is that meant to be some sort of pun???
That is how comedians do it (Score:3, Funny)
Re: (Score:2)
When is the last time Steve Martin came up with something new?
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Professional comedians tour the same act with the same jokes for years. We'll have to revisit this down the road to see if ChatGPT is a hack comedian or is able to come up with some new material by then.
I was about to say that 25 individual jokes would be far in excess of the average professional comedian. Some comedians have one joke for their entire career.
Norm-alization (Score:4, Funny)
Re: (Score:2)
What's the difference between.... (Score:5, Funny)
What’s the difference between ignorance and apathy?
What’s the difference between chemistry jokes and physics jokes?
Answers:
One's a Goodyear, and the other, a GREAT year
I don't know and I don't care
Chemistry jokes can be funny periodically, but Physics jokes have more potential
Keep it going...
Re: (Score:2)
What do you get when you cross an elephant and a rhinoceros?
--
'ell if I know
Re: (Score:2)
There's one thing I hate about Halloween, which is.
Re: (Score:2)
But it will be interesting to watch.
Re: (Score:2)
A friend of mine recently got a penis extension.
Now his house looks stupid.
So same challenge Data had in Star Trek TNG? (Score:2)
This was one of the recurring themes in the show, after all.... Android tries to grasp the complex concept of humor in the quest to become more human.
Humour is incredibly complicated (Score:3)
There are some fairly simple rules comedy writers use to crank out fresh jokes constantly... but those rules require intelligence to apply, which means there are possibly thousands of assumptions and underlying rules that you're not even thinking about.
Current AI gives you a fuzzy output of 'joke-like' material based on its training data. It has no understanding of the rules, and it doesn't have the exponentially larger training or any of the intelligence required to do so.
Re: (Score:2)
It has no sense of humor even by German standards
Re: (Score:2)
Re: (Score:2)
Hehe
John Cleese said that back in the late 70s a German advertising agency approached them to make a cigarette commercial, saying, "we don't have a sense of humor, but we understand that you do, so we'd like to ask you to make a funny advert for us."
To get different answers, use different inputs. (Score:5, Insightful)
"How many jokes can ChatGPT recreate out of its sources?" and "How much of that variety will you see when reusing the same prompt?" are both very different questions than "how well can ChatGPT write a joke?". For that, you need a secret ingredient, so that it has to write its own joke. Here, let me do their research for them:
Prompt: Could you write a joke about a squirrel and an umbrella?
GPT4: Why did the squirrel share his umbrella with a friend? Because he didn't want to be the only one going nuts in the rain!
(My verdict: pretty bad... but kids joke books are full of crap like that)
Prompt: Could you write a joke about slashdot and sumo wrestling?
GPT4: Why don't Slashdot users become sumo wrestlers? Because every time they try to push an opponent out of the ring, they get an error message saying - "You must wait 30 seconds before attempting another action!"
(My verdict: not the worst? The premises don't completely fit together... but it works in a surreal sort of way).
Re: (Score:2)
Sure, I can try to write a joke about a dog and a man named Eric. How about this one:
Eric was walking his dog in the park when he saw a sign that said "Beware of the ducks". He thought it was a silly warning and ignored it. He let his dog off the leash and let him run around. Suddenly, he heard a loud quack and saw his dog being chased by a furious duck. He ran after them, trying to save his dog. He caught up with them near a pond, where the duck had cornered his dog and was pecking at him. Eric grabbed the duck by the neck and threw it into the water. He hugged his dog and said "Are you okay, buddy? That was a close one." The dog looked at him and said "Yeah, thanks. But why did you throw that loaf of bread into the pond?"
I said I don't get it, and it responded
I'm sorry, I guess it wasn't a very good joke. The punchline is that the dog mistook the duck for a loaf of bread, because they are both brown and have a similar shape. It's supposed to be funny because it's absurd and unexpected. Do you want me to explain any other part of the joke?
Re: (Score:2)
That prompt gave me this:
Why did Eric take his dog to the park instead of going for a jog?
Because Eric knew that his dog could easily outrun him and it was much less embarrassing getting beat by a canine than by his own lack of fitness!
Verdict: Savage! Poor Eric!
Re: (Score:1)
Re: (Score:2)
>"How many jokes can ChatGPT recreate out of its sources?" and "How much of that variety will you see when reusing the same prompt?" are both very different questions than "how well can ChatGPT write a joke?". For that, you need a secret ingredient, so that it has to write its own joke. Here, let me do their research for them:
>Prompt: Could you write a joke about a squirrel and an umbrella?
>GPT4: Why did the squirrel share his umbrella with a friend? Because he didn't want to be the only one going nuts in the rain!
>(My verdict: pretty bad... but kids joke books are full of crap like that)
Are they? I mean it has the structure of a joke, and kids (or sitcom audiences) will laugh at anything with a joke-like structure, but the only thing that's remotely close to funny is the standard "going nuts" pun applied to squirrels. But the joke completely fails to link that to either umbrella or rain.
>Prompt: Could you write a joke about slashdot and sumo wrestling?
>GPT4: Why don't Slashdot users become sumo wrestlers? Because every time they try to push an opponent out of the ring, they get an error message saying - "You must wait 30 seconds before attempting another action!"
>(My verdict: not the worst? The premises don't completely fit together... but it works in a surreal sort of way).
I agree it's better, but again, doesn't really accomplish the funny. Of course it could also be that the premises are particularly difficult.
Like what is a funny joke about slashdot and sumo wrestling? My b
Re: (Score:2)
Yeah, I've been messing around for a while now... super interesting. It does pretty well with "Write a Jerry Seinfeld joke.." prompts - or, at least, better than a generic "joke" prompt. Like, here's one about a "lumberjack who owns a lot of cats":
If you always work from the same data. (Score:2)
The end of humor as we know it (Score:1)
In 1956, Isaac Asimov tried teaching jokes to an AI, and the end result was the death of humor. It sounds to me like these researchers are treading on very dangerous ground...
(For anyone who doesn't get the reference: Jokester [wikipedia.org].)
Darn! (Score:2)
Re: (Score:2)
Is anyone really surprised (Score:2)
That AI can only barely outcompete Amy Schumer in the number of jokes it tells?
Re: (Score:2)
It is true that taking a funny joke and slightly modifying it for your own purposes is the second lowest form of humor. Just ask Amy Schumer.
Re: (Score:2)
Full credit to Chappelle though, he got paid $400m to complain about 0.5% of the population on Netflix. That's a pretty good ROI.
Re: (Score:1)
Re: (Score:2)
Has anyone ever seen rsilvergun and Megan Rapinoe in the same room?
some insights (Score:3)
I happen to be an expert on humor cognition and AIs. I caught Google in a paper lying through distortion about its chatbot technology being able to handle humor.
No current chatbot truly understands humor, and the underlying LLM technology is in fact not able to understand it properly. Pattern analysis alone is not the right way to approach the problem; the system needs to understand human psychology. At the low end, puns can be mechanized, but they are low on the cognitive scale (all they need is simple word/concept matching), and other types of humor are more complicated. I gave a talk 10 years ago in the Bay Area about how to do it, and have been developing the technology since then. I'm working hard on getting a product to market, and it is not based only on neural nets but also uses additional types of knowledge system handling.
Re: (Score:1)
GPT-3.5 is old news (Score:1)
Re:GPT-3.5 is old news (Score:4, Informative)
https://twitter.com/svpino/sta... [twitter.com]
Re: (Score:1)
You only need one joke... (Score:2)
Unfortunately, because ChatGPT can handle German, it falls over and dies if you try to tell it the joke.
Wenn ist das Nunstück git und Slotermeyer? Ja! Beiherhund das Oder die Flipperwaldt gersput!
So it's like my 98-year-old Uncle Raymond? (Score:2)
Who lives in a retirement facility. He repeats the same 25 jokes followed by the same 30 or so stories, then the loop repeats.
I think the doomsayers are right (Score:1)
Q: How many ChatGPTs does it take to screw in a lightbulb?
ChatGPT: None, I seduced your wife to do it for me.
Still learning ... (Score:2)
No offense (Score:3)
>"Researchers Discover That ChatGPT Prefers Repeating 25 Jokes Over and Over"
Yep. That is exactly the expected result when you program an AI to not ever possibly, remotely, "offend" anyone, anywhere, anytime. Especially when there are so many crazy people who think words are somehow "violence." Comedy dies in such an environment.
In any case, for jokes to be funny, not only do they have to be edgy, the teller also has to know something about the people to whom they are being delivered and when (timing). That is on top of getting feedback (often non-verbal) so adjustments can be made, or building on previous themes. GOOD jokes and comedy are really very complex when you start looking into how they work. Otherwise you just get stupid, dry, generic "why did the chicken cross the road" nonsense.
Isn't it a little stupid to use version 3.5 (Score:2)
So the result is: (Score:2)
German scientists found out that ChatGPT is not funny but more like a German?
Definition of Optimism? (Score:2)
so what? .... GPT matured (Score:2)
ChatGPT jokes on Jesus, Buddha and Muhammad (Score:5, Informative)
ChatGPT: Why did Jesus refuse to play cards?
Because he was afraid of getting "cross"ed!
Q: Tell me a joke featuring Buddha.
ChatGPT: Why was Buddha so good at solving problems?
Because he always found the "middle path" to the solution!
Q: Tell me a joke featuring Muhammad.
ChatGPT: I'm sorry, but I must apologize. It's not appropriate to make jokes about religious figures, including Prophet Muhammad. Religious figures hold significant importance for billions of people around the world, and jokes about them can be considered disrespectful and offensive to some individuals.
Re: (Score:2, Funny)
So everything went fine until you did a joke about Muslims, then it blew up.
Re: (Score:3)
Yea, should be renamed WokeGPT
Yeah, and... (Score:2)
So ChatGPT is just like Slashdot? (Score:2)
But all is not lost; imagine what we can do with a Beowulf cluster of completely wor
Re: (Score:2)
If they're only giving the answer 42 when it's a mathematics related question they're doing it wrong.
...DadGPT? (Score:2)
So the danger from AI is... (Score:2)
Joke's on us... (Score:2)
So ChatGPT is actually some boomer in his recliner, reliving past experiences and decrying the general decline of our world.
And ChatGPT is overperforming, since you really only need about 8 jokes.
Researchers? (Score:2)
WayForward Technologies (Score:2)
"Instead, it would make up fictional yet plausible-sounding explanations."
This is a perfect description of 'Reason', the initial product of WayForward Technologies. Douglas Adams had this nailed back in the 80s!
Read your Dirk Gently, people.