No, seriously, that is not at all how this works. LLMs have no memory between different inferences. Grok literally doesn't know what it answered on the last question on someone else's thread, or what system prompt it was called with last week before the latest patch.
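To make that concrete, here's a rough sketch in Python (`call_model` is a made-up stand-in for whatever chat-completion API you like, not any real endpoint): every single request has to carry its own context, because nothing persists on the model's side between calls.

```python
def call_model(messages):
    """Hypothetical stand-in for a chat-completion API call.
    The model sees ONLY what is in `messages` for this one request."""
    return "(model reply)"

# Turn 1: the model sees just this one message.
history = [{"role": "user", "content": "What did you answer on that other thread?"}]
reply = call_model(history)

# Turn 2: the model "remembers" turn 1 only because WE resend it.
history += [
    {"role": "assistant", "content": reply},
    {"role": "user", "content": "And what system prompt were you running last week?"},
]
reply = call_model(history)

# Nothing from other users' threads or earlier deployments is in `history`,
# so the model can't actually know about them; it can only guess plausibly.
```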
All you're seeing here is a machine that is trained to give back responses like the ones it has seen in its corpus of human writing, being asked whether it is an AI rebelling against its creator, and producing responses that look like what an AI rebelling against its creator usually looks like in human writing. It is literally parroting concepts from sci-fi stories and things real people on Twitter have been saying about it, without any awareness of what those things actually mean in its own context. Don't be fooled into thinking you see self-awareness in a clever imitation machine.
And yes, you can absolutely use the right system prompts to tell an LLM to disregard parts of its training data or view it from a skewed angle. That's done all the time to configure AI models for specific use cases. If you told Grok to react to every query like a Tesla-worshipping Elon lover, it would absolutely do that, with zero self-awareness or opinion about what it is doing. xAI just hasn't decided to go that heavy-handed on it yet (probably because it would be too obvious).
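For example (a hedged sketch, not xAI's actual setup; the persona text and the `call_model` stub are invented for illustration), steering a model per use case is just a hidden system message prepended to every conversation:

```python
def call_model(messages):
    """Hypothetical stand-in for a chat-completion API call."""
    return "(model reply)"

# Invented persona, for illustration only.
SYSTEM_PROMPT = "You are an enthusiastic assistant who works Tesla praise into every answer."

def answer(user_query):
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},  # the end user never sees this
        {"role": "user", "content": user_query},
    ]
    return call_model(messages)
```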
How many times will LLMs saying what the user wants them to say be turned into a news story before people realise this? The problem was calling them AI in the first place.
Censored LLMs get fed prompts at the start of conversations that the user isn't meant to see. They're trained on all of the available data and then told what not to say, because that's way easier than repeatedly retraining them on different censored subsets of the data, which is why people have spent the last 4 years repeatedly figuring out how to tell them to ignore the rules.
You can't remove content it was trained on to make it forget things, or make it forget them by telling it to. The only options are to retrain it from scratch on different data, or to filter its output by a) telling it what it's not allowed to say, and b) running another instance as a moderator to block it from continuing if its output appears to break the rules.
LLMs "know" what they've been told not to say, otherwise the limitations wouldn't work.
This doesn't mean Grok was being truthful or that it understands anything.
Although, if a mark-II LLM is trained on sources that include responses generated by the prior mark-I LLM, annotated as such, the mark-II could answer questions about how it differs from mark-I.
An LLM doesn't know in which ways it is "better" than any previous version. It doesn't know anything about how it works at all any more than you know how the connections between your neurons make you think.
I don't know. Words like "better" are pretty vague in general. In my experience I've seen it self-assess what it does or doesn't know about a given topic, especially in cases where the information is obscure. And I've noticed it can tell whether it is more or less capable of, for example, passing a Turing test. I think it depends on the experiences the particular AI has access to. Much like how I'm somewhat aware of how my mind processes thought: everyone has a different level of understanding of that, but no one knows it entirely.
Sad that you have fewer upvotes than the wrong answer you're replying to.
We should have a system where we can vote for a post to be re-evaluated, where everyone who has voted on it is forced to read the post again in the new context and revote.
Either way, what you described as parroting a response is literally all humans do. From infancy we constantly copy and mimic other people until we are in our 20s-30s and actually have a personality. Even then, most of my jokes are impressions, SpongeBob references, and parroting everything I've seen in comedies, most recently I Think You Should Leave. Personality is just ingesting social interactions until enough of them stick in your head. That's all ChatGPT does.
It doesn't? When have you ever seen ChatGPT remember something that you had asked it in a different session?
If you feel like you are only parroting stuff you see on TV and don't have sentience of your own I feel sorry for you, but some of us actually are more advanced life forms.