You know how Google’s new feature called AI Overviews is prone to spitting out wildly incorrect answers to search queries? In one instance, AI Overviews told a user to use glue on pizza to make sure the cheese won’t slide off (pssst…please don’t do this.)
Well, according to an interview at The Verge with Google CEO Sundar Pichai published earlier this week, just before criticism of the outputs really took off, these “hallucinations” are an “inherent feature” of AI large language models (LLMs), which are what drive AI Overviews, and this feature “is still an unsolved problem.”
They keep saying it’s impossible, when the truth is it’s just expensive.
That’s why they won’t do it.
You could train AI only on good sources (scientific literature, not social media) and then pay experts to talk with the AI for long periods of time, giving feedback directly to the AI.
Essentially, if you want a smart AI you need to send it to college, not drop it off at the mall unsupervised for 22 years and hope for the best when you pick it back up.
it’s just expensive
I’m a mathematician who’s been following this stuff for about a decade or more. It’s not just expensive. Generative neural networks cannot reliably evaluate truth values; it will take time to research how to improve AI in this respect. This is a known limitation of the technology. Closely controlling the training data would certainly make the information more accurate, but that won’t stop it from hallucinating.
The real answer is that they shouldn’t be trying to answer questions using an LLM, especially because they had a decent algorithm already.
Yeah, I learned neural networks way back when those things were starting out in the late 80s/early 90s, I use AI (though seldom machine learning) in my job, and I really dove into how LLMs are put together when they started getting important. These things operate entirely at the language level, on the probabilities of language tokens appearing in certain places given context. They do not at all translate from language to meaning and back, so there is no logic going on there, nor is there any possibility of it.
Maybe some kind of ML can help do the transformation from the language space to a meaning space where things can be operated on by logic and then back, but LLMs aren’t a way to do it: whatever internal representation spaces (yeah, plural) they use in their inner layers aren’t those of meaning, and we don’t really have a way to apply logic to them.
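To make the "probabilities of language tokens" point concrete, here is a minimal sketch of a toy bigram model. It is nowhere near a real LLM (those use neural networks over far longer contexts), and the corpus, function names, and tokens are all made up for illustration. The only thing it is meant to show is that the generator picks what was frequent in its training text, with no step anywhere that checks whether the output is true.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration, not real training data).
CORPUS = [
    "glue keeps cheese on pizza",
    "glue keeps cheese on pizza",
    "sauce keeps cheese on pizza",
]

def train_bigrams(sentences):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev):
    """Turn raw follow-up counts into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}

def generate(counts, rng, max_len=10):
    """Sample one token at a time from the learned distribution."""
    token, out = "<s>", []
    for _ in range(max_len):
        probs = next_token_probs(counts, token)
        token = rng.choices(list(probs), weights=list(probs.values()))[0]
        if token == "</s>":
            break
        out.append(token)
    return " ".join(out)

counts = train_bigrams(CORPUS)
probs = next_token_probs(counts, "<s>")   # "glue" is twice as likely as "sauce"
sentence = generate(counts, random.Random(0))
```

In this toy, "glue" opens a sentence twice as often as "sauce" purely because it appeared twice as often in the corpus. Nothing in the code knows or cares which one belongs on a pizza.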
It’s worse than that. “Truth” can no more reliably be found by machines than it can be by humans. We’ve spent centuries of philosophy trying to figure out what is “true”. The best we’ve gotten is some concepts we’ve been able to convince a large group of people to agree to.
But even that is shaky. For a simple example, we mostly agree that bleach will kill “germs” in a petri dish. In a single announcement, we saw 40% of the American population accept as “true” that bleach would also cure them if injected straight into their veins.
We’re never going to teach machines to reason for us when we meatbags constantly change truth to be what will be profitable to some at any given moment.
Are you talking about epistemics in general or alethiology in particular?
Regardless, the deep philosophical concerns aren’t really germane to the practical issue of just getting people to stop falling for obvious misinformation or people being wantonly disingenuous to score points in the most consequential game of numbers-go-up.
In addition to the other comment, I’ll add that just because you train the AI on good and correct sources of information, it still doesn’t necessarily mean that it will give you a correct answer all the time. It’s more likely, but not ensured.
Yes, thank you! I think this should be written in capitals somewhere so that people could understand it quicker. The answers are not wrong or right on purpose. LLMs don’t have any way of distinguishing between the two.
That’s just not how LLMs work, bud. It doesn’t have understanding to improve, it just munges the most likely word next in line. It, as a technology, won’t advance past that level of accuracy until it’s a completely different approach.
They could also perform some additional iterations with other models on the result to verify it, or even to enrich it; but we come back to the issue of costs.
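A rough sketch of what "additional iterations with other models" could look like is below. Everything in it is hypothetical: `cross_check`, the `min_agreement` threshold, and the three stand-in "models" are invented for illustration and are not how Google or anyone else is known to do it. It only shows the shape of the idea, which is to ask several models and keep an answer only when enough of them agree.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# A "model" here is any callable mapping a question to an answer string.
Model = Callable[[str], str]

def cross_check(question: str, models: Sequence[Model],
                min_agreement: float = 0.6) -> Optional[str]:
    """Return the majority answer, or None when the models disagree too much."""
    answers = [model(question).strip().lower() for model in models]
    answer, votes = Counter(answers).most_common(1)[0]
    if votes / len(answers) >= min_agreement:
        return answer
    return None  # not enough agreement: better to show no answer at all

# Hypothetical stand-ins for real models (assumptions, not real APIs).
def model_a(question: str) -> str:
    return "No"

def model_b(question: str) -> str:
    return "no"

def model_c(question: str) -> str:
    return "Yes"

result = cross_check("Should I put glue on pizza?", [model_a, model_b, model_c])
split = cross_check("Should I put glue on pizza?", [model_a, model_c])
```

Here `result` is "no" (two of three agree) and `split` is `None` (one against one). Two caveats: every extra model is an extra inference call per query, which is exactly the cost issue, and agreement is not truth, since models trained on the same bad data can confidently agree on the same wrong answer.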
Why not solve it before training the AI?
Simply make it clear that this tech is experimental, then provide sources and context with every result. People can make their own assessment.
I think you’re right that with sufficient curation and highly structured monitoring and feedback, these problems could be much improved.
I just think that to prepare an AI, in such a way, to answer any question reliably and usefully would require more human resources than there are elementary particles in the universe. We would be better off connecting live college educated human operators to Google search to individually assist people.
So I don’t know how helpful it is to say “it’s just expensive” when the entire point of AI is to be lower cost than a battalion of humans.
The solution to the problem is to just pull the plug on the AI search bullshit until it is actually helpful.
Honestly, they could probably solve the majority of it by blacklisting Reddit from fulfilling the queries.
But I heard they paid for that data so I guess we’re stuck with it for the foreseeable future.
Don’t wait for it, usage data is valuable to them.
If you can’t fix it, then get rid of it, and don’t bring it back until we reach a time when it’s good enough to not cause egregious problems (which is never, so basically don’t ever think about using your silly Gemini thing in your products ever again)
Corps hate looking bad. Especially to shareholders. The thing is, and perhaps it doesn’t matter, most of us actually respect the step back more than we do the silly business decisions for that quarterly .5% increase in a single dot on a graph. Of course, that respect doesn’t really stop many of us from using services. Hell, I don’t like Amazon but I’ll say this: I still end up there when I need something, even if I try to not end up there in the first place. Though I do try to go to the website of the store instead of using Amazon when I can.
I miss the olden days of Google
Since when has feeding us misinformation been a problem for capitalist parasites like Pichai?
Misinformation is literally the first line of defense for them.
But this is not misinformation, it is uncontrolled nonsense. It directly devalues their offering of being able to provide you with an accurate answer to something you look for. And if their overall offering becomes less valuable, so does their ability to steer you using their results.
So while the incorrect nature is not a problem in itself for them, (as you see from his answer)… the degradation of their ability to influence results is.
But this is not misinformation, it is uncontrolled nonsense.
The strategy is to get you to keep feeding Google new prompts in order to feed you more ads.
The AI response is just a gimmick. It gives Google something to tell their investors, when they get asked “What are you doing with AI right now? We hear that’s big.”
But the real money is getting unique user interactions for the purpose of serving up more ad content. In that model, bad answers are actually better than no answers, because they force the end user to keep refining the query and searching through the site backlog.
If you don’t know the answer is bad, which confident idiots spouting off on Reddit and being upvoted into infinity have proven is common, then you won’t refine your search. You’ll just accept the bad answer and move on.
Your logic doesn’t follow. If someone doesn’t know the answer and is searching for it, they likely won’t be able to tell if the answer is correct. We literally already have that problem with misinformation. And what sounds more confident than an AI?
I don’t believe they will retain user interactions if the reason for the user interactions disappears. The value of Google is they provide accurate search results.
I can understand some users just want to be spoonfed an answer. But that’s not what most people expect from a search engine.
I want Google to use actual AI to filter out all the nonsense sites that turn a Reddit post into an article of 500 words using an LLM without any actual value. That should be Google’s proposition.
The value of Google is they provide accurate search results.
They offer the most accurate results of search engines you’re familiar with. But in a shrinking field with degrading quality, that’s a low bar and sinking quick.
I want google to use actual AI to filter out all the nonsense sites
So did the last head of Google search, until the new CEO fired him.
Google isn’t bothered by incorrect results because search results are no longer their product. Constantly rising stock values are their product now. Hype is their path to those higher values.
That is actually a really good observation.
But this is not misinformation, it is uncontrolled nonsense.
Fair enough… but drowning out any honest discourse with a flood of histrionic right-wing horseshit has always been the core strategy of the US propaganda model - I’d say that their AI is just doing the logical thing and taking the horseshit to a very granular level. I mean… “put glue on your pizza” is just not that far off “drink bleach to kill viruses on the inside.”
I know I’m describing a pattern that probably wasn’t intentional (I hope) - but the pattern does look like it could fit.
Oh don’t get me wrong I know exactly what you mean and I agree… it’s just that the LLMs are spewing actual nonsense and that breaks the whole principle of what a search engine should do… provide me accurate results.
AI isn’t giving the right misinformation
Well, we can’t have that, can we?
LLMs trained on shitposting are too obvious for it to be quality misinformation.
For quality disinformation they should train them solely on MBA course-work and documents produced by people with MBAs.
Sure, the rate of false information would be even worse, but it would be formatted in slick ways meant to obfuscate meaning, which would avoid the kind of hilarity that has ensued when Google deployed an LLM trained on Reddit data and thus be much better for Google’s stock price.
It is probably the most telling demonstration of the terrible state of our current society that one of the largest corporations on earth, which got where it is today by providing accurate information, is now happy to knowingly provide incorrect, and even dangerous, information in its own name, and not give a flying fuck about it.
Wikipedia got where it is today by providing accurate information. Google results have always been full of inaccurate information. Sorting through the links for respectable sources just became second nature, then we learned to scroll past ads to start sorting through links. The real issue with misinformation from an AI is that people treat it like it should be some infallible Oracle - a point of view only half-discouraged by marketing with a few warnings about hallucinations. LLMs are amazing, they’re just not infallible. Just like you’d check a Wikipedia source if it seemed suspect, you shouldn’t trust LLM outputs uncritically. /shrug
and our parents told us Wikipedia couldn’t be trusted…
Huh. That made me stop and realize how long I’ve been around. Wikipedia still feels like a new addition to society to me, even though I’ve been using it for around 20 years now.
And what you said, is something I’ve cautioned my daughter about, and first said that to her about ten years ago.
Conservapedia to the rescue.
“It’s broken in horrible, dangerous ways, and we’re gonna keep doing it. Fuck you.”
you need ai if you want your stock to go up
Do you need AI or do you just need to use the term AI? Because it seems like the latter is usually enough.
Step 1. Replace CEO with AI. Step 2. Ask New AI CEO, how to fix. Step 3. Blindly enact and reinforce steps
Rip up the Reddit contract and don’t use that data to train the model. It’s the definition of a garbage in garbage out problem.
mithtaketh were made
Myth takes were made
So if a car maker releases a car model that randomly turns abruptly to the left for no apparent reason, you simply say “I can’t fix it, deal with it”? No, you pull it out of the market, try to fix it and, if this is not possible, then you retire the model before it kills anyone.
I bet if there weren’t agencies forcing them to do this they wouldn’t recall.
Or you market it as Tesla’s self-driving mode
simply say “I can’t fix it, deal with it”
That’s pretty much the business model of Tech Giants and AAA game makers.
This is so wild to me… as a software engineer, if my software doesn’t work 100% of the time as requested in the specification, it fails tests, doesn’t get released and I get told to fix all issues before going live.
AI is basically another word for unreliable software full of bugs.
I mean they could disable it until it works, else it’s knowingly misleading people
Obviously you don’t have a business degree.
Google CEO essentially says the first result should not be trusted.
This is what happens every time society goes along with tech bro hype. They just run directly into a wall. They are the embodiment of “Didn’t stop to think if they should” and it’s going to cause a lot of problems for humanity.
The answer is don’t inflate your stock price by cramming the latest tech du jour into your flagship product… but we all know that’s not an option.