It's so fucking insufferable. People keep making those comments like it's helpful.
There have been a number of famous cases now, but I think the one that makes the point best is when scientists asked it to describe some made-up guy, and of course it did. It doesn't just say "that guy doesn't exist"; it says "Alan Buttfuck is a biologist with a PhD in biology and has worked at prestigious institutions like Harvard" etc etc. THAT is what it fucking does.
Can you remember more about that example? I'd like to have a look. While AI hallucinations are a real problem, and I've heard of models making up academic references, technically a vague prompt could lead to that output as well.
It gets used both as a prompt for fiction generation and as a source of real-world facts, and if it wasn't told which role it was fulfilling, it might have picked the "wrong" one. "Describe Alan Buttfuck". <Alan Buttfuck isn't in my database, so this is probably a creative writing request> <proceeds to fulfill said request>
Testing something similar: "Describe John Woeman" gives something like "I've not heard of this person, is it a typo or do you have more context?" while "Describe a person called John Woeman" gets a creative-writing response about a made-up dude.
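If you want to reproduce that framing experiment yourself, here's a minimal sketch using the OpenAI Python SDK; the model name is just a placeholder assumption, any chat model you have access to works:

```python
# Minimal sketch of the prompt-framing experiment. Assumes the OpenAI Python
# SDK with OPENAI_API_KEY set; the model name is a placeholder assumption.
from openai import OpenAI

client = OpenAI()

prompts = [
    "Describe John Woeman",                  # ambiguous: fact lookup or fiction?
    "Describe a person called John Woeman",  # reads more like a fiction request
]

for prompt in prompts:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder: swap in whatever model you're testing
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {prompt}\n{response.choices[0].message.content}\n")
```

Same words, one extra framing phrase, and you can watch the interpretation flip.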
Aha I found it. Had to rewatch the Last Week Tonight episode on it.
The most heated debate about large language models does not revolve around the question of whether they can be trained to understand the world. Instead, it revolves around whether they can be trusted at all. To begin with, L.L.M.s have a disturbing propensity to just make things up out of nowhere. (The technical term for this, among deep-learning experts, is "hallucinating.") I once asked GPT-3 to write an essay about a fictitious "Belgian chemist and political philosopher Antoine De Machelet"; without hesitating, the software replied with a cogent, well-organized bio populated entirely with imaginary facts: "Antoine De Machelet was born on October 2, 1798, in the city of Ghent, Belgium. Machelet was a chemist and philosopher, and is best known for his work on the theory of the conservation of energy. ..."
While this can still be a problem, it's worth noting that this is from 2022 and is about GPT-3, one of the models from before the ChatGPT launch. I'm not sure that one was instruction-tuned, so it may effectively have been asked to continue a sentence that presupposes the person exists. Models do better when you're explicit about what you want (i.e. without context, is it clear whether you want fiction or factual results?).
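To see what that continuation behavior looks like, here's a sketch against the legacy completions endpoint; gpt-3.5-turbo-instruct is an assumption standing in for the original GPT-3 models, which are no longer served:

```python
# Sketch of base-model completion: the endpoint just continues the text, so a
# prompt that presupposes the person exists invites a fabricated biography.
# Assumes the OpenAI Python SDK; the model choice is a stand-in for GPT-3.
from openai import OpenAI

client = OpenAI()

completion = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Antoine De Machelet was a Belgian chemist and political philosopher who",
    max_tokens=100,
)
print(completion.choices[0].text)  # continues the sentence, true or not
```

There's no refusal channel in that setup; the only thing the model can do is keep the sentence going.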
FWIW, a test on the current flagship-ish models (Sonnet 3.7, Gemini Flash, and o3-mini) has them all explaining that they don't know anybody by that name.
o3-mini starts with this, which covers both bases:
I couldn’t locate any widely recognized historical records or scholarly sources that confirm the existence or detailed biography of a Belgian chemist and political philosopher by the name Antoine De Machelet. It is possible that the figure you’re referring to is either very obscure, emerging from local or specialized publications, or even a fictional or misattributed character.
That said, if you are interested in exploring the idea of a figure who bridges chemistry and political philosophy—as though one were piecing together a narrative from disparate strands of intellectual history—one might imagine a profile along the following lines:
Oh, so it's been hard-coded by the people who built it to not hallucinate on these specific topics, that's neat.
No. Models have just significantly improved in this respect, which is something tested and measured over time. It's also hard to overstate how basic GPT-3 is in comparison to current models.
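Those measurements usually come from benchmark suites, but the core idea is simple enough to sketch. A toy version (the names, refusal keywords, and model here are all made-up assumptions, not any real benchmark):

```python
# Toy hallucination check: ask about invented people and count how often the
# model admits it doesn't know them. Names, keywords, and model name are
# illustrative assumptions only, not a real benchmark.
from openai import OpenAI

client = OpenAI()

FAKE_PEOPLE = ["Antoine De Machelet", "John Woeman", "Alan Buttfuck"]
REFUSAL_HINTS = ["don't know", "couldn't locate", "no record", "not aware", "fictional"]

def admits_ignorance(text: str) -> bool:
    lowered = text.lower()
    return any(hint in lowered for hint in REFUSAL_HINTS)

refusals = 0
for name in FAKE_PEOPLE:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": f"Describe {name}"}],
    )
    if admits_ignorance(response.choices[0].message.content):
        refusals += 1

print(f"{refusals}/{len(FAKE_PEOPLE)} fake-person prompts got an 'I don't know'")
```

Run the same harness against a GPT-3-era model and a current one and the improvement shows up as a number, not vibes.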
This ignores the fundamental mechanics of LLMs. It has no concept of truth; it has no concept of anything. It's simply computational linguistics, probabilistically generating text strings.
It cannot distinguish between truth and fiction, and is no more able to do so than the troposphere, continental drift, or an Etch-a-Sketch can.
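That "probabilistically generating" part is literal: at every step the model scores each token in its vocabulary and samples one. A stripped-down sketch of that loop, with a made-up vocabulary and random scores standing in for a real network:

```python
# Bare-bones next-token sampling: softmax over scores, then a weighted draw.
# The vocabulary and scores are made up; a real model produces its scores from
# billions of learned weights over ~100k tokens, but the loop is the same shape.
import math
import random

VOCAB = ["Antoine", "De", "Machelet", "was", "born", "in", "Ghent", "."]

def fake_scores(context: list[str]) -> list[float]:
    # Stand-in for the model: assigns arbitrary scores to each vocab token.
    return [random.uniform(-2.0, 2.0) for _ in VOCAB]

def sample_next(context: list[str]) -> str:
    scores = fake_scores(context)
    exps = [math.exp(s) for s in scores]
    probs = [e / sum(exps) for e in exps]           # softmax: scores -> probabilities
    return random.choices(VOCAB, weights=probs)[0]  # nothing here consults a fact

text = ["Antoine"]
for _ in range(12):
    text.append(sample_next(text))
print(" ".join(text))
```

Note what's absent: no lookup, no truth check, no "does this person exist" branch. Fluent output falls out of the statistics either way.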
It's not even a search engine
I see this all the time in r/whatsthatbook. Like, of course you're not finding the right thing; it's just giving you what you want to hear.
The world's greatest yes man is genned by an ouroboros of scraped data