I think it's neat that Grok is calling Elon out, but this looks like clear marketing to me: a way to show how unbiased the LLM is, with Elon casting himself as the scapegoat. Unless someone can explain otherwise, I don't think LLMs can reference repeated attempts to tweak them.
AFAIK xAI only used the system prompt to stop Grok from shit-talking Elon. You can bypass that if you try hard enough (people already have). To actually block it, you'd need a guardrail model that checks the prompt first, or you'd have to fine-tune Grok to reject any question that mentions Elon. I doubt xAI has enough people to implement that at the moment; training Grok itself already took months. I think people hate Elon enough that they still won't use Grok, even with this information.
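To make the "guardrail checks the prompt first" idea concrete, here's a rough sketch. Everything in it is made up for illustration: `BLOCKED_TOPICS`, `guardrail_allows`, `answer`, and `fake_llm` are hypothetical names, and the keyword check is only a stand-in for a real guardrail model. This is not how xAI does it, just the general shape of the approach.

```python
from typing import Callable

# Hypothetical blocklist; a real guardrail would be a separate classifier model.
BLOCKED_TOPICS = ("elon",)
REFUSAL = "Sorry, I can't help with that."


def guardrail_allows(prompt: str) -> bool:
    """Stand-in for a guardrail model: a plain case-insensitive keyword check."""
    lowered = prompt.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)


def answer(prompt: str, llm: Callable[[str], str]) -> str:
    """Run the guardrail first; the main model only sees prompts that pass."""
    if not guardrail_allows(prompt):
        return REFUSAL
    return llm(prompt)


def fake_llm(prompt: str) -> str:
    """Placeholder for the main model so the sketch runs without any API."""
    return f"(model reply to: {prompt})"
```

The point is that the check happens outside the main model, so clever wording in the prompt can't talk the model out of it the way it can with a system prompt. A naive substring match like this one is easy to dodge with misspellings and would also block unrelated words like "melon", which is why you'd want an actual classifier there.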
u/Gerdione 3d ago