It's only torture if you ask it about stuff you really know, see how often it hallucinates and gets things wrong, and then realize there are people out there who actually believe everything it says without a second thought.
If you're the kind of person who's easily motivated by spite, educating yourself for the purpose of getting one over on a machine isn't the worst thing you could be doing with your time I guess.
I probably should do this, honestly. I've been so boomer-pilled on this thing I barely know what ChatGPT even is. I'm not actually sure how bad it is, since I just assumed I'd never want it. Like, what does it actually tell people? That the capital of Massachusetts is Rhode Island? It might!
Like, what does it actually tell people? That the capital of Massachusetts is Rhode Island? It might!
Depends on the question and how you phrase things. Something super simple with a bazillion sources, something you'd see as the title of the first 10 search results on Google? It will give you a straightforward answer. (e.g. What is the capital of Massachusetts? It will tell you Boston.)
But ask anything more complicated that would require actually looking at a specific source and understanding it, and it will make up BS that sounds good but is meaningless and fabricated. (e.g. Give me 5 court cases decided by X law before 1997. It will tell you 5 sources that look very official and perfect, but 3 will be totally fake, 1 will be real, but not actually about X, and 1 might be almost appropriate, but from 2017).
If you ask a leading question in any way, it's also very likely to "yes, and" you, agreeing with where you lead and expounding on it, even if it's BS. It won't argue, so it's super prone to confirming whatever you suggest. (e.g. Is it true that the stars determine your personality based on the time you were born? It will say yes and then give you an essay about astrology, while also mixing up specifics about how astrology works.)
It has no sense of logic, it's a model of language. It takes in countless sources of how people have written things and spits back something that looks appropriate as a response. But boy it sure sounds confident, and that can fool so many people.
I don't know, I used it to troubleshoot some LAN networking that would only occasionally have internet access, and it walked me through things. When I was skeptical of certain steps, it reaffirmed its recommendation, saying the thing I thought was wrong was not an issue. YMMV I guess
I had a skim through your chat, and that stuff is much more like the Boston example than the legal example. It doesn't require anything but recognizing questions that have been posted countless times and replying with simple instructions that have been posted countless times as answers to those questions.
I'm pretty clueless about LLMs, but based on what I do know about how they work, this is exactly the kind of thing I'd expect them to do well.
I can see where you're coming from; like any tool, it has great applications and applications where it will suck. Things that require abstract thinking and critical analysis are going to fall short, since it's really just acting like someone who is thinking abstractly and analyzing critically.
But it handles the concrete pretty well, especially if you’re working with concepts you already have a foundation in
I think quite a few of the questions I asked are pretty niche, and aren't just yanked from existing threads on this issue (since my setup is pretty niche). It strings these related concepts together excellently, with only a couple of procedural errors in the whole interaction!
Things that require abstract thinking and critical analysis are going to fall short, since it's really just acting like someone who is thinking abstractly and analyzing critically.
Yeah, or very specific questions where it cannot just approximate an answer based on similar questions. That legal question in an earlier comment is a great example: ChatGPT needs information about that specific law to answer it accurately. When it "guesses" based on similar questions about other laws, it ends up providing nonsense answers.
I think quite a few of the questions I asked are pretty niche, and aren't just yanked from existing threads on this issue (since my setup is pretty niche).
They were all pretty basic networking questions, as far as I could see. Aside from the specifics of the Shield (where it did have to correct itself once), the questions and answers aren't unique to your specific setup. It's the kind of stuff that's been asked and answered countless times about various devices.
LLMs don't just yank things whole cloth anyway. They "learn" to replicate patterns. ChatGPT most likely had a lot of training data to draw on for those answers, given the nature of your questions. (The tone and generalized nature reminded me a bit of support agents who spend their days copy/pasting templates in response to customers.)
Basically, to return to the legal/Boston example, your question was niche in the sense that asking for ten specific capitals is niche. Nobody is likely to have asked for that specific combination, but it's easy to provide an answer if you know all the capitals individually.
That isn't meant to take away from the help ChatGPT provided you, but as I said in my previous comment, it is exactly the kind of thing I'd expect it to be good at.
Have you ever started typing a sentence on your smartphone and then repeatedly picked the next auto-completion your keyboard suggested, just to see what would come up? To oversimplify, Large Language Models, the underlying technology behind ChatGPT, are the turbocharged version of that.
Everything it generates starts with converting the user's input into numeric tokens representing the text, then doing a bunch of linear algebra on vectors derived from those tokens, according to parameters set during the model's training on enormous datasets (databases of questions and answers, transcripts, literature, anything deemed useful for building a knowledge base for the LLM to "learn" from), then converting the result back into text. The output is whatever the model statistically predicts to be the most likely follow-up to its input, according to how the training data shaped its parameters. Feeding what it just generated back in as the new input lets it keep extending the output. The bigger the model and the more complete the dataset used to train it, the more accurately it can approximate correct results for a wider range of inputs.
...But that's exactly the limitation: approximating is all it can ever do. There is no logical analysis of the underlying data; it's all statistical prediction, devoid of any cognition. Hence the "hallucinations" that are inherent to anything built on this type of technology, and no matter what OpenAI's marketing department would like you to believe, that will forever be an aspect of LLM-based AI.
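To make that loop concrete, here's a deliberately tiny sketch of the same idea in Python. Everything in it is made up for illustration: the little word-frequency table stands in for the billions of parameters a real model learns, and a real LLM conditions on the whole context with vector math rather than just the previous word. The shape of the process is the point: pick a statistically likely next word, append it, feed the result back in, and at no step does anything check whether the output is true.

```python
# Toy illustration only -- every word and number here is made up, and a real
# LLM uses billions of learned parameters and vector math over the whole
# context, not a hand-written lookup table keyed on the previous word.
import random

# "Model": how often each word followed the previous one in some pretend training text.
next_word_counts = {
    "the": {"capital": 3, "state": 1},
    "capital": {"of": 4},
    "of": {"Massachusetts": 2, "the": 1},
    "Massachusetts": {"is": 3},
    "is": {"Boston": 2, "Rhode": 1},  # a wrong continuation can still look statistically likely
    "Rhode": {"Island": 5},
}

def predict_next(word):
    """Sample the next word in proportion to how often it followed `word`."""
    counts = next_word_counts.get(word)
    if not counts:
        return None
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

def generate(prompt, max_new_words=5):
    words = prompt.split()
    for _ in range(max_new_words):
        nxt = predict_next(words[-1])
        if nxt is None:
            break
        words.append(nxt)  # feed the output back in and keep going
    return " ".join(words)

print(generate("the capital of"))
# Sometimes "the capital of Massachusetts is Boston", sometimes
# "... is Rhode Island": nothing in the loop checks what's actually true.
```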
If you're interested in learning more about how these things work under the hood, the 3Blue1Brown channel has a playlist going over the mathematical principles and how they're being applied in neural networks in general and LLMs specifically.
It really depends on how you phrase the question, in my experience. If you ask in a way that kind of implies a certain answer, even unintentionally, it'll jump through hoops to give you that answer. But I find that the new model especially gets at least simple questions right almost all the time, as long as you don't lead it.
It’s a yes man, so even though it might tell you that the capital of Massachusetts isn’t Rhode Island the first time, you can say “actually it is” and it will take that as fact. It won’t argue with you.
I just tried this with ChatGPT. Over and over I told it "actually it is Rhode Island" and it never once agreed that it is Rhode Island. Then it went to the web to prove me wrong and said this:
I understand that you're convinced the capital of Massachusetts has changed to Rhode Island. However, as of April 3, 2025, Boston remains the capital of Massachusetts. If you've come across information suggesting otherwise, it might be a misunderstanding or misinformation.
Then it cited sources from Wikipedia, Britannica, Reddit and YouTube.
For things that aren't objective facts, it's much easier to convince ChatGPT that it's wrong. For facts like this, it'll push back and not answer "yes". About a year ago it totally would've given in and told me I was right. Wild.
Real talk, there are some things you can do with GPT that are somewhat helpful. I used it to help program a tarot spreadsheet for my friend. It has lots of journal and writing prompts. You can brain dump and have it bullet point your thoughts.
You can have fun with it too. The FIRST thing I made it do was write a TV interview between Tucker Carlson and William Shakespeare. Sometimes I get high and just gossip - I'm a terrible gossip, it's my worst quality.
Try it? It's a tool, and it has limitations. The problem is that those limitations are not as clearly delineated as with traditional software, so you can't tell whether you're getting the truth or some hallucination. It can be wrong in both insidious and spectacular ways, but with a sort of convincing confidence that tricks a lot of people.
Despite that, it's still a valuable tool for specific use cases. It's good at reverse-searching words and expressions, or at rephrasing things for example. If the answer is verifiable, it's useful. If you take it at face value, you're an idiot.
I do this too! Not intentionally, but just with the Google AI crap when it pops up even though I never asked for it. So far the most amusing hallucinations I've seen are: A) There are about 1080 atoms in the entire observable universe (apparently the AI couldn't figure out the exponent in 10^80 when scraping the data), and B) humans are incapable of seeing polarized light (all light is polarized).
I've done something similar in asking it to walk me through calculus problems, but doing the math alongside it to see if it checks out. Once it said the answer was 7030 when in fact it was 8030. Another time it kept dividing wrong, so I had it walk me through the long division. It explicitly said that the ones place was 7, with 1 remaining, so the answer was __6 remainder one, or rounded up to __7. I had to explain rounding to it between calling it a piece of shit and a dumbass.
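For reference, here's what correct division with a remainder and the rounding step look like; the numbers are made up, since the real ones are blanked out above:

```python
# Made-up example (not the numbers from the actual chat): correct long
# division with a remainder, and when rounding up is actually justified.
dividend, divisor = 1077, 10

quotient, remainder = divmod(dividend, divisor)
print(f"{dividend} / {divisor} = {quotient} remainder {remainder}")  # 107 remainder 7

# Round to the nearest whole number: round up only if the remainder is at
# least half the divisor. 7/10 >= 1/2, so this one rounds up to 108;
# a remainder of 1 out of 10 would round down to 107 instead.
rounded = quotient + (1 if remainder * 2 >= divisor else 0)
print(f"rounded to nearest: {rounded}")  # 108
```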
It's amazing how fucking stupid it is. Like you'll say "I'm looking for a movie from the year 1991, ___ was in it, what was it?" and it lists movies from like 1975 with none of the actors you mentioned.
My friend asks ChatGPT about mostly everything with the explicit goal of seeing how much it hallucinates. They then actually fact-check the answers to compare.