Why I am not impressed by A.I.
-
[email protected] replied to [email protected] last edited by
There's also an "r" in the first half of the word, "straw", so it was completely skipping over that r and just focusing on the r's in the word "berry".
-
[email protected] replied to [email protected] last edited by
I asked mistral/brave AI and got this response:
How Many Rs in Strawberry
The word "strawberry" contains three "r"s. This simple question has highlighted a limitation in large language models (LLMs), such as GPT-4 and Claude, which often incorrectly count the number of "r"s as two. The error stems from the way these models process text through a process called tokenization, where text is broken down into smaller units called tokens. These tokens do not always correspond directly to individual letters, leading to errors in counting specific letters within words.
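For the curious, here's a minimal sketch of that tokenization point in Python, using OpenAI's open-source tiktoken library. Which tokenizer any given chatbot actually uses is an assumption on my part, and the exact split varies by model; the point is just that the model sees subword chunks, not letters.

```python
# Sketch of the tokenization issue described above (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a GPT-4-era encoding
tokens = enc.encode("strawberry")

# The model operates on subword tokens, not individual letters:
print([enc.decode_single_token_bytes(t) for t in tokens])

# Counting letters outside the model, by contrast, is trivial:
print("strawberry".count("r"))  # 3
```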
-
[email protected] replied to [email protected] last edited by
sounds like a perfectly sane idea https://freethoughtblogs.com/pharyngula/2025/02/05/ai-anatomy-is-weird/
-
[email protected] replied to [email protected] last edited by
I think these are actually valid examples, albeit ones that come with a really big caveat; you're using AI in place of a skill that you really should be learning for yourself. As an autistic IT person, I get the struggle of communicating with non-technical and neurotypical people, especially clients who you have to be extra careful with. But the reality is, you can't always do all your communication by email. If you always rely on the AI to correct your tone or simplify your language, you're choosing not to build an essential skill that is every bit as important to doing your job well as it is to know how to correctly configure an ACL on a Cisco managed switch.
That said, I can also see how the AI can be a helpful learning tool at first, as you build those skills. There's certainly an argument that by using the tools while paying attention to their output, you build those skills for yourself. Learning by example works. I think used in that way, there's potentially real value there.
Which is kind of the broader story with Gen AI overall. It's not that it can never be useful; it's that, at best, it can only ever aspire to "useful." No one, yet, has demonstrated any ability to make AI "essential" and the idea that we should be investing hundreds of billions of dollars into a technology that is, on its best days, mildly useful, is sheer fucking lunacy.
-
[email protected] replied to [email protected] last edited by
The issue is that AI is being invested in as if it can replace jobs. That's not an issue for anyone who wants to use it as a spellchecker, but it is an issue for the economy, for society, and for the planet, because billions of dollars of computer hardware are being built and run on the assumption that trillions of dollars of payoff will be generated.
And correcting someone's tone in an email is not, and will never be, a trillion dollar industry.
-
[email protected] replied to [email protected] last edited by
Sure, but for what purpose would you ever ask about the total number of a specific letter in a word? This isn't the gotcha that so many think it is. The LLM answers like it does because it makes perfect sense for someone to ask if a word is spelled with a single or double "r".
-
[email protected] replied to [email protected] last edited by
The dumbed down text is basically as long as the prompt. Plus you have to double check it to make sure it didn't write "outrage" instead of "outage", just like if you wrote it yourself.
Are you really saving time?
-
[email protected] replied to [email protected] last edited by
If you always rely on the AI to correct your tone or simplify your language, you’re choosing not to build an essential skill that is every bit as important to doing your job well as it is to know how to correctly configure an ACL on a Cisco managed switch.
This is such a good example of how AI/LLMs/whatever are being used as a crutch that is far more impactful than using a spellchecker. A spellchecker catches typos or helps with unfamiliar words, but doesn't replace the underlying skill of communicating with your audience.
-
[email protected] replied to [email protected] last edited by
Dumbed down doesn't mean shorter.
-
[email protected] replied to [email protected] last edited by
Yes, I'm saving time. As I mentioned in my other comment:
Yeah, normally my "Make this sound better" or "summarize this for me" is a longer wall of text that I want to simplify, I was trying to keep my examples short.
And
and helps correct my shitty grammar at times.
And
Hallucinations are a thing, so validating what it spits out is definitely needed.
-
[email protected] replied to [email protected] last edited by
It works well. For example, we had a work exercise where we had to write a press release based on an example, then write a Shark Tank pitch to promote the product we came up with in the release.
I gave AI the link to the example and a brief description of our product, and it spit out an almost perfect press release. I only had to tweak a few words because there were specific requirements I didn't feed the AI.
Then I told it to take the press release and write the pitch based on it.
Again, very nearly perfect with only having to change the wording in one spot.
-
[email protected] replied to [email protected] last edited by
If the amount of time it takes to create the prompt is the same as it would have taken to write the dumbed down text, then the only time you saved was by not learning how to write dumbed down text. Plus you need to know what dumbed down text should look like to know if the output is dumbed down but still accurate.
-
[email protected] replied to [email protected] last edited by
AI is slower, less efficient, and less accurate than the older search algorithms.
-
[email protected] replied to [email protected] last edited by
They’re pretty good at summarizing, but don’t trust the summary to be accurate, just to give you a decent idea of what something is about.
That is called being terrible at summarizing.
-
[email protected] replied to [email protected] last edited by
y do you ask?
-
[email protected] replied to [email protected] last edited by
We also didn't make the Model T suggest replacing the engine when the oil light comes on. Cars, as it happens, aren't that great at self-diagnosis, despite that technology being far simpler and further along than generative models are. I don't trust the model to tell me what temperature to bake a cake at, so I'm sure as hell not going to trust it with medical information. Googling symptoms was risky at best before. It's a horror show now.
-
[email protected] replied to [email protected] last edited by
This is a bad example. If I ask a friend, "Is strawberry spelled with one or two r's?" they would think I'm asking about the last part of the word.
The question seems specifically made to trip up LLMs. I've never heard anyone ask how many of a certain letter are in a word. I've heard people ask how you spell a word, and whether it's with one or two of a specific letter, though.
If you think of LLMs as something with actual intelligence, you're going to be very unimpressed. It's just a model that predicts the next word.
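To make "predict the next word" concrete, here's a toy sketch in Python: a bigram frequency table built from a made-up ten-word corpus. Real LLMs are transformer networks trained on trillions of tokens, not lookup tables, but the basic move (picking a statistically likely continuation rather than reasoning) is the same. Everything here is invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, for illustration only.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent continuation seen in the corpus."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat": frequency, not understanding
```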
-
[email protected] replied to [email protected] last edited by
It makes perfect sense if you do mental acrobatics to explain why a wrong answer is actually correct.
-
[email protected] replied to [email protected] last edited by
How do you validate the accuracy of what it spits out?
Why don't you skip the AI and just use the thing you use to validate the AI output?
-
[email protected] replied to [email protected] last edited by
It wasn't focusing on anything. It was generating text per its training data. There's no logical thought process whatsoever.