AI Hallucinations Detection
Do you have trouble detecting AI hallucinations?
I know they are out there, but they don’t seem to appear when I use CoPilot. Of course, I’m using CoPilot on topics where I’m not an expert.
When it comes to programming or setting up new Excel tricks, I can immediately tell when CoPilot is wrong because the program, function, or process does not work. The feedback is immediate.
But non-technical stuff becomes more difficult.
Current state of AI
Daily there are new articles proclaiming new frontiers that AI can conquer, but it is just not ready yet for prime time due to hallucinations. The latest I just read this afternoon concerns using CoPilot for your medical needs. Not sure why anyone would want to use AI for their medical needs but just in case you were thinking of doing so, DON’T.
“42% of AI answers were considered to lead to moderate or mild harm, and 22% to death or severe harm.” A damning research paper suggests that Bing / Microsoft Copilot AI medical advice may actually kill you.
Jez Corden, “‘42% of AI answers were considered to lead to moderate or mild harm, and 22% to death or severe harm.’ A damning research paper suggests that Bing / Microsoft Copilot AI medical advice may actually kill you,” Windows Central, 10/12/2024
Ouch!
Yes, you still need to go to a doctor – a human doctor.
But I have this lingering anxious feeling that I may be consuming fake information in non-detectable ways.
Exploring AI Hallucinations
So, I have an AI discussion group that I attend weekly, and we’ve started running experiments with AI to learn more about how to use it productively. For the past few weeks we’ve been trying to determine how to spot hallucinations, because it is not easy.
The first effort started off with questions surrounding the early days of computers. One of the attendees was actually around and was a participant in the beginning. I think she worked at Apple and may have noodled around in other early tech companies. The first question was to ask CoPilot for the major software for Apple products.
Something like that.
She got one set of answers, and I received another. The response wasn’t far off, but mine contained a nuance that a regular everyday person (like me) would not recognize. The items in my output were actually Apple’s operating software that came with the system; in other words, they were not sold separately as software.
So the answer really wasn’t quite right but I didn’t recognize the error.
We went on with quite a few questions, but we really didn’t discern any out-and-out errors, so we decided to change tactics: force it to hallucinate.
We decided to ask it a question that was esoteric but whose answer was known to be out on the web. Something obscure. We picked an old website that had articles written back in the 80s and was still there.
And that’s when CoPilot went off the rails.
The expert said that her AI was kind of not working before that point, but it really started to act up when we tried to push it to lead us to that old website. It wouldn’t go there. Instead, it told us that the library had copies of the articles from that old website.
“Yeah, but I wanted the link to that site, not the library!” she typed to CoPilot.
And then it gave some strange response:
Huh?
Exploratory Effort Two
A couple of weeks later, we tried another topic: the Great Recession.
The generative AI tools that we explored were ChatGPT o1, CoPilot, Gemini, and Perplexity.
And they all gave the same generic answers.
This was a tougher topic in that none of us were experts on the Great Recession. The only qualification I had for it was that I read a LOT of books after 2008. I wanted to know what had led to the financial collapse and possibly how I could make sure I do the right thing at work.
Since we used different generative AI tools, this gave us the chance to compare the answers. They all pretty much said the same thing, with maybe some minor variations. I even asked my CoPilot to tell me if the Perplexity answer was correct, and it came back with a yes, plus a few small clarifications and additions.
Findings on AI Hallucinations
Here are some general thoughts on trying to detect AI hallucinations:
- Having been burned (think Google recommending glue on pizza to keep the sausage on), I think most companies have placed guardrails on the responses in order to reduce the incidence of hallucinations.
- I believe the responses will be what I call “safe” answers. On the topic of the Great Recession of 2008, there is some dispute around the impact of the partial repeal of the Glass-Steagall Act during the Clinton administration. Some said there was no impact and others said there was. These AIs will probably steer away from contentious topics.
- With that in mind, the responses from different AIs should start to converge on the same bland answers.
- In order to catch the hallucinations, you have to be an expert in the topic. But then, why would you ask it questions? Trying to validate the information could become time consuming, and I believe that is why employees are saying that AI is actually leading to more work. So, we need a quick and easy way of doing the validation.
- To elicit material that is more interesting and possibly more promising for investigation, you will have to ask the AI to provide outliers. One possible way of doing this is to include in the prompt: “You are an investigative reporter. What are some of the questions not being asked about this topic?” I often like to ask it to give me outliers that few people are talking about.
- To speed up your validation of the information, you might want to consider asking each AI to review the other AIs’ responses for accuracy. You could even ask it to compare and contrast its answer with another AI’s answer. Basically, use the AIs against each other.
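For anyone who wants to try the last two ideas without retyping prompts each time, here is a minimal sketch in Python. It only builds the prompt text; it does not call any AI service, so you would still paste each prompt into the chat window yourself. The function names (outlier_prompt, cross_review_prompts) and the wording of the review prompt are my own assumptions for illustration, not anything the AI tools provide.

```python
from itertools import permutations

# The reporter framing quoted in the list above.
REPORTER_FRAMING = (
    "You are an investigative reporter. "
    "What are some of the questions not being asked about this topic?"
)


def outlier_prompt(topic: str) -> str:
    """Hypothetical helper: attach the reporter framing to a topic."""
    return f"{REPORTER_FRAMING}\nTopic: {topic}"


def cross_review_prompts(question: str, answers: dict[str, str]) -> dict[tuple[str, str], str]:
    """Hypothetical helper: build one review prompt per (reviewer, author) pair.

    `answers` maps each AI's name to the answer it gave. Every AI is asked
    to review every other AI's answer, but never its own.
    """
    prompts = {}
    for reviewer, author in permutations(answers, 2):
        prompts[(reviewer, author)] = (
            f"Question: {question}\n"
            f"Another AI gave this answer:\n{answers[author]}\n"
            "Review this answer for accuracy. List any claims that are wrong "
            "or unsupported, then compare and contrast it with your own answer."
        )
    return prompts


if __name__ == "__main__":
    # Placeholders only: replace each with the answer you actually received.
    collected = {
        "CoPilot": "<paste CoPilot's answer here>",
        "Gemini": "<paste Gemini's answer here>",
        "Perplexity": "<paste Perplexity's answer here>",
    }
    print(outlier_prompt("the Great Recession"))
    for (reviewer, author), text in cross_review_prompts(
        "What led to the Great Recession?", collected
    ).items():
        print(f"\n--- Paste into {reviewer} (reviewing {author}) ---\n{text}")
```

With three AIs this produces six review prompts, one for each reviewer and author pairing. Keep in mind that two AIs agreeing with each other is not proof that either one is right; the point is only to surface disagreements worth checking by hand.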
Conclusions
I am almost thinking that the hallucinations will become less and less of an issue as the years roll on. The companies will get better at setting up guardrails around the answers.
In the meantime, you could probably use AI to do the validation for you, but in a competitive fashion: have one AI proofread another AI.