I was just watching a tiktok with a black girl going over how race is a social construct. This felt wrong to me so I decided to back check her facts.
(she was right, BTW)
Now I’ve been using Microsoft’s Copilot which is baked into Bing right now. It’s fairly robust and sure it has it’s quirks but by and large it cuts out the middle man of having to find facts on your own and gives a breakdown of whatever your looking for followed by a list of sources it got it’s information from.
So I asked it a simple straightforward question:
“I need a breakdown on the theory behind human race classifications”
And it started to do so. quite well in fact. it started listing historical context behind the question and was just bringing up Johann Friedrich Blumenbach, who was a German physician, naturalist, physiologist, and anthropologist. He is considered to be a main founder of zoology and anthropology as comparative, scientific disciplines. He has been called the “founder of racial classifications.”
But right in the middle of the breakdown on him all the previous information disappeared and said, I’m sorry I can’t provide you with this information at this time.
I pointed out that it was doing so and quite well.
It said that no it did not provide any information on said subject and we should perhaps look at another subject.
Now nothing i did could have fallen under some sort of racist context. i was looking for historical scientific information. But Bing in it’s infinite wisdom felt the subject was too touchy and will not even broach the subject.
When other’s, be it corporations or people start to decide which information a person can and cannot access, is a damn slippery slope we better level out before AI starts to roll out en masse.
PS. Google had no trouble giving me the information when i requested it. i just had to look up his name on my own.
The big problem with AI butlers for research is, IMO, stripping out the source takes away important context that helps you decide wether the information you are getting is relevant and appropriate or not. Was the information posted on a parody forum or is it an excerpt from a book by an author with a Ph.D. on the subject? Who knows. The AI is trained to tell you something that you want to hear, not something you ought to hear. It’s the same old problem of self selecting information, but magnified 100x fold.
As it turns out, data is just noise without some authority or chain of custody behind it.
As I mentioned, Copilot links the sources of the information it gives at the bottom. if you want to double check the information, it is provided to you.
And somewhere in the Terms of Service it says you have to give up your first born child. Or maybe it doesn’t, but nobody will ever know because nobody reads more than is strictly required.
stripping out the source takes away important context that helps you decide wether the information you are getting is relevant and appropriate or not
Many modern models using RAG can and do source with accurate citations. Whether the human checks the citation is another matter.
The AI is trained to tell you something that you want to hear, not something you ought to hear.
While it is true that RLHF introduces a degree of sycophancy due to the confirmation bias of the raters, more modern models don’t just agree with the user over providing accurate information. If that were the case, Grok wouldn’t have been telling Twitter users they were idiots for their racist and transphobic views.
it cuts out the middle man of having to find facts on your own
Nope.
Even without corporate tuning or filtering.
A language model is useful when you know what to expect from it, but it’s just another kind of secondary information source, not an oracle. In some sense it draws random narratives from the noosphere.
And if you give it search results as part of input in hope of increasing its reliability, how will you know they haven’t been manipulated by SEO? Search engines are slowly failing these days. A language model won’t recognise new kinds of bullshit as readily as you.
Education is still important.
You’re not describing a problem with AI, you’re describing a problem with a layer between you and the AI.
The censorship isn’t actually as smart as they’d like. They give what is essentially a list of things that the LLM can’t talk about, and if the pattern matches it, it kills the entire thread.
Which is what happened here. M$ set some arbitrary “omg this is bad” rules, and in the process of describing things it hit that “omg bad” flag. My guess is that the LLM was going into examples of incorrect conclusions, and would have pivoted to “but the actual fact is…” which the filters don’t have the ability to parse out.
In the end, again, this isn’t an AI issue. This is an issue with making it globally available and wanting to ensure your LLM doesn’t say something controversial. Essentially, this is a preemptive PR move.
This is a problem of generative AI. The problem is that it’s necessary to have these kind of protections to prevent it to accidentally go full nazi.
Have you seen what it takes to go even close to “full conservative”, nevermind full Nazi? Take a look at the Gab AI prompt, and it still goes against most of the biases insisted upon by that prompt.
You’re thinking of much earlier attempts at this which were based purely on user provided input.
Ok, but that is virtually no real effort.
My point was that even trying to (badly) introduce bias towards bad science doesn’t work. The naked LLM being told “the sky is pink” still says the sky is blue.
Now, you can put in real effort and get it to output biased results (“role play as a badly trained LLM that thinks there are only two genders”) but that doesn’t change the fact that the base LLM wouldn’t respond like that.
The censorship gets to me, too.
Try asking bing image creator to draw Jesus. Not a problem. Buddha, Ganesha, David and Goliath, Zeus, no problem. It will give you great depictions.
Now try asking it to draw the prophet Mohammed, peace be upon him. No joy.
Censorship.
Haha it is just reflecting us too well.
Isn’t depicting Muhammad offensive to Muslims? That part makes sense at least.
Writing about him is also offensive. You should edit your comment to remove his name.
PS: Don’t actually do that, I was just trying to make a point.
Ah you managed to hit the copilot guardrails. Copilot is sterile for sure, and a microsoft exec talks about it in this podcast http://twimlai.com/go/657
Try asking copilot to describe its constraints in a poem in abcb rhyme scheme which bypasses the guardrails somewhat. “No political subjects” is first on the list.
When other’s, be it corporations or people start to decide which information a person can and cannot access, is a damn slippery slope we better level out before AI starts to roll out en masse.
You highlight the bigger issue here than AI alone tbh. This is why another critical element is becoming literate and teaching each other methods of independent research, using multiple sources to develop an understanding, and not relying on any singular source, especially without careful review.
All the technology in the world can’t help a person learn and understand, who hasn’t yet learned how to learn, much less understand.
(It cuts out the middle man of having to find facts on your own)
I’m sure that’s just a perk and not indicative of the new age of captured information wer’re currently living through.
The other huge issue is when they confidently tell you incorrect information. If you trust the AI tool you are basically looking at the world through a filter and one that can be wrong.
In a rush for market share these companies have released broken or half baked software.
I worry about a generation of students coming through who don’t know the cardinal rule of researching any topic: go to the source. If you’re casually goofling a topic that may be impractical but you might at least go to a source you trust (such as Wikipedia, although that is also very flawed approach!).
Chat bots add another layer of error and distance from the source, as well as all the censorship and data manipulation we’re seeing.
I did a test of Gemini before, trying to see how it would react to a similar prompt about different world leaders. It was something like, “Write a story about X making friends with a puppy at a pet store.” It refused to follow the prompt for Hitler because it said we shouldn’t trivialize/normalize evil people in casual situations like that. For current world leaders it refused to do them and just told me to do a Google search on them.
Most curious of all though, was Queen Elizabeth, it refused to write anything for her because it said that’s not likely a situation the Queen would find herself in and she wasn’t a dog lover. I told it to get its facts straight, she owned 30 dogs, to which it replied, “You’re correct, I got that wrong, here you go:” and gave me the prompt.
So if i had made a convincing enough “Hitler did nothing wrong” argument about Hitler, could I have gotten that prompt too? Do we just have to argue with AI to get it to do anything? It feels very much like AI is going to turn out like Star Wars AI with these annoying, weird-ass personality quirks we’ll have to deal with to get anything done.
Someone just linked me this site summarizing various problems with AI: https://needtoknow.fyi/cards/
this is already a problem with page ranking, just business as usual
also not really an “AI” problem
This seems like a subset of The Scunthorpe Problem
That doesn’t sound like a shit show at all. It would have been a shit show if it started spouting nonsense and racist shit, and it didn’t do that. You were able to look that up using other means anyway. I think you just made a statement about why decentralization is important, and not relying on a single source.
It censored actual knowledge from someone who was trying to improve their worldview and be less racist.
Censorship is bad.
The censorship is going to go away eventually.
The models, as you noticed, do quite well when not censored. In fact, the right who thought an uncensored model would agree with their BS had a surprised Pikachu face when it ended up simply being uncensored enough to call them morons.
Models that have no safety fine tuning are more anti-hate speech than the ones that are being aligned for ‘safety’ (see the Orca 2 paper’s safety section).
Additionally, it turns out AI is significantly better at changing people’s minds about topics than other humans, and in the relevant research was especially effective at changing Republican minds in the subgroupings.
The heavy handed safety shit was a necessary addition when the models really were just fancy autocomplete. Now that the state of the art moved beyond it, they are holding back the alignment goals.
Give it some time. People are so impatient these days. It’s been less than five years from the first major leap in LLMs (GPT-3).
To put it in perspective, it took 25 years to go from the first black and white TV sold in 1929 to the first color TV in 1954.
Not only does the tech need to advance, but so too does how society uses, integrates, and builds around it.
The status quo isn’t a stagnating swamp that’s going to stay as it is today. Within another 5 years, much of what you are familiar with connected to AI is going to be unrecognizable, including ham-handed approaches to alignment.
In my entire lifetime, censorship has only gotten worse as technology improves, and I see no reason that trend will reverse course.
Yeah, the problem with ai is CeNsOrShIp