A fully automated, on-demand, personalized con man, ready to lie to you about any topic you want, doesn’t really seem like an ideal product. I don’t think that’s what the developers of these LLMs set out to make when they created them, either. However, I’ve seen this behavior to a certain extent in every LLM I’ve interacted with. One of my favorite examples was a particularly small-parameter version of Llama (I believe it was Llama-3.1-8B) confidently insisting to me that Walt Disney invented the Matterhorn (like, the actual mountain) for Disneyland. Now, this is something along the lines of what people have been calling “hallucinations” in LLMs, but the fact that it would not admit it was wrong when confronted, and instead used confident language to try to convince me it was right, is what pushes that particular case across the boundary into what I would call “con-behavior”. Assertiveness is not always a property of this behavior, though. Lately, OpenAI (and, I’m sure, other developers) have been training their LLMs to be more “agreeable” and to acquiesce to the user more often. That doesn’t eliminate the con-behavior either. I’d like to show you another example of this con-behavior, one that is much more problematic.
It’s no more a conman than the average person. The problem is that people consider it an oracle of truth and get shocked when they discover it can be just as deceitful as the next person.
All it takes is for people to run the same question by different AI models, get conflicting answers, and see the difference to understand that at least one of the answers is wrong.
But alas…
Just don’t use it. Duh.
Yeah, people talk about them replacing employees, but imagine an employee who wrote reports using random made-up facts whenever they didn’t know something, presented them as completely true, insisted they were true even when caught and shown direct evidence to the contrary, and occasionally would wildly hallucinate and spout gibberish for seemingly no reason at all. I don’t think they’d last that long.
They could be president of the United States
This is the best, most succinct comment on this I’ve ever read.
Yes, this was a specific problem with Gemini. They obviously tried to overcorrect for hallucinations and for being too gullible, but it ended up making the model certain of its hallucinations.
Hallucination rate for their latest model is 0.7%
https://github.com/vectara/hallucination-leaderboard
Should be <0.1% within a year
Hallucination rates when summarizing are significantly lower than when generating code (since the original document is in context).
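For what it’s worth, the number on that leaderboard is basically bookkeeping: score each (source document, generated summary) pair with a factual-consistency judge, flag the pairs that fall below some cutoff, and report the flagged fraction. Here’s a minimal sketch of that calculation; `score_consistency` and the 0.5 threshold are hypothetical stand-ins for whatever judge model and cutoff the benchmark actually uses.

```python
from typing import Callable, List, Tuple

def hallucination_rate(
    pairs: List[Tuple[str, str]],                     # (source document, model-generated summary)
    score_consistency: Callable[[str, str], float],   # hypothetical judge: 1.0 = fully supported by the source
    threshold: float = 0.5,                           # assumed cutoff; the real benchmark picks its own
) -> float:
    """Fraction of summaries whose consistency score falls below the threshold."""
    flagged = sum(
        1 for doc, summary in pairs
        if score_consistency(doc, summary) < threshold
    )
    return flagged / len(pairs) if pairs else 0.0

if __name__ == "__main__":
    # Toy usage with a stub scorer; a real run would use a trained consistency model.
    stub = lambda doc, summary: 1.0 if summary in doc else 0.0
    data = [
        ("The cat sat on the mat.", "The cat sat on the mat."),
        ("The cat sat on the mat.", "Walt Disney built the Matterhorn."),
    ]
    print(f"hallucination rate: {hallucination_rate(data, stub):.1%}")  # 50.0%
```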
LLMs are specifically and exclusively designed to appeal to investors. Once you accept that as fact, the rest all just falls into place.
Yeah, gen AI is a great demo with very limited real-world applications. It’s like showing a website with pretty graphs and placeholder text. It conveys potential, but in that state it has very limited functionality for real people.
It told me Biden won the 2024 election. I thought I landed in an alternate timeline.
It told me how Trump stole the election, and gave a step-by-step analysis of how they used AI and billionaire backing to do it: how they would have hacked the voting machines, astroturfed movements and groups, used bots to sway opinions, made robocalls to confuse voters, and of course pushed a shitload of automated propaganda, among other tactics. The conversation is no longer present on my profile, and I didn’t delete it myself.
Hallucination or not, that’s whack.
Can we go there? Can you show me the way?