agnos.is Forums

AGI achieved 🤖

Lemmy Shitpost · 142 posts · 69 posters
  • R [email protected]

    https://en.wikipedia.org/wiki/Shibboleth

    I’ve heard “squirrel” was used to trap Germans.

    merc@sh.itjust.worksM This user is from outside of this forum
    merc@sh.itjust.worksM This user is from outside of this forum
    [email protected]
    wrote on last edited by
    #82

If you've ever heard Germans try to pronounce "squirrel", it's hilarious. I've known many extremely bilingual Germans who couldn't pronounce it at all. It came out sounding roughly like "squall", or they'd over-pronounce the "r" and it would be "squi-rall".

    • R [email protected]

      It's funny how people always quickly point out that an LLM wasn't made for this, and then continue to shill it for use cases it wasn't made for either (The "intelligence" part of AI, for starters)

      merc@sh.itjust.worksM This user is from outside of this forum
      merc@sh.itjust.worksM This user is from outside of this forum
      [email protected]
      wrote on last edited by
      #83

> then continue to shill it for use cases it wasn't made for either

      The only thing it was made for is "spicy autocomplete".

[email protected] (#84), replying to [email protected]:

> If you've ever heard Germans try to pronounce "squirrel", it's hilarious. […]

        Sqverrrrl.

[email protected] (#85), replying to [email protected]:

>> LLM wasn’t made for this
>
> There's a thought experiment that challenges the concept of cognition, called the Chinese Room. What it essentially postulates is a conversation between two people, one of whom is speaking Chinese and getting responses in Chinese. And the first speaker wonders, "Does my conversation partner really understand what I'm saying, or am I just getting elaborate stock answers from a big library of pre-defined replies?"
>
> The LLM is literally a Chinese Room. And one way we can know this is through these interactions. The machine isn't analyzing the fundamental meaning of what I'm saying; it is simply mapping the words I've input onto a big catalog of responses and giving me a standard output. In this case, the problem the machine is running into is a legacy meme about people miscounting the number of "r"s in the word "strawberry". So "2" is the stock response it knows via the meme reference, even though a much simpler and dumber machine designed to handle this basic input question could have come up with the answer faster and more accurately.
>
> When you hear people complain about how the LLM "wasn't made for this", what they're really complaining about is their own shitty methodology. They built a glorified card catalog: a device that can only take inputs, feed them through a massive library of responses, and sift out the highest-probability answer without actually knowing what the inputs or outputs signify cognitively.
>
> Even if you want to argue that having a natural language search engine is useful (damn, wish we had a tool that did exactly this back in August of 1996, amirite?), the implementation of the current iteration of these tools is dogshit because the developers did a dogshit job of sanitizing and rationalizing their library of data. Also, incidentally, why DeepSeek was running laps around OpenAI and Gemini as of last year.
>
> Imagine asking a librarian "What was happening in Los Angeles in the Summer of 1989?" and that person fetching you back a stack of history textbooks, a stack of sci-fi screenplays, a stack of regional newspapers, and a stack of Iron Man comic books, all given equal weight. Imagine hearing the plot of The Terminator and Escape from L.A. intercut with local elections and the Loma Prieta earthquake.
>
> That's modern LLMs in a nutshell.

> Imagine asking a librarian "What was happening in Los Angeles in the Summer of 1989?" and that person fetching you ... That's modern LLMs in a nutshell.

          I agree, but I think you're still being too generous to LLMs. A librarian who fetched all those things would at least understand the question. An LLM is just trying to generate words that might logically follow the words you used.

IMO, one of the key ideas with the Chinese Room is the assumption that the computer / book in the Chinese Room experiment has infinite capacity in some way. So, no matter what symbols are passed to it, it can come up with an appropriate response. But, obviously, while LLMs are incredibly huge, they can never be infinite. As a result, they can often be "fooled" when they're given input that is semantically similar to a meme, joke or logic puzzle. The vast majority of the training data that matches the input is the meme, or joke, or logic puzzle. LLMs can't reason, so they can't distinguish between "this is just a rephrasing of that meme" and "this is similar to that meme but distinct in an important way".
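For contrast, the "much simpler and dumber machine" mentioned in the quoted post is about one line of code. A minimal Python sketch, just to make that concrete:

    # Counts the actual characters, so the meme answer "2" can't happen:
    # no training data involved, only the string itself.
    def count_letter(word: str, letter: str) -> int:
        return word.lower().count(letter.lower())

    print(count_letter("strawberry", "r"))  # -> 3, every time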

          • R [email protected]

            then 14b, man sooo close...

            merc@sh.itjust.worksM This user is from outside of this forum
            merc@sh.itjust.worksM This user is from outside of this forum
            [email protected]
            wrote on last edited by
            #86

            And people are trusting these things to do jobs / parts of jobs that humans used to do.

            • I [email protected]

              I know there’s no logic, but it’s funny to imagine it’s because it’s pronounced Mrs. Sippy

              merc@sh.itjust.worksM This user is from outside of this forum
              merc@sh.itjust.worksM This user is from outside of this forum
              [email protected]
              wrote on last edited by
              #87

              How do you pronounce "Mrs" so that there's an "r" sound in it?

              • B [email protected]

                It's marketed like its AGI, so we should treat it like AGI to show that it isn't AGI. Lots of people buy the bullshit

                merc@sh.itjust.worksM This user is from outside of this forum
                merc@sh.itjust.worksM This user is from outside of this forum
                [email protected]
                wrote on last edited by
                #88

                You can even drop the "a" and "g". There isn't even "intelligence" here. It's not thinking, it's just spicy autocomplete.

[email protected] (#89), replying to [email protected] (the original post contained no text content):

People who think that LLMs having trouble with these questions is evidence one way or another about how good or bad LLMs are just don't understand tokenization. This is not a symptom of some big-picture deep problem with LLMs; it's a curious artifact, like compression noise in a JPEG image, and it doesn't really matter for the vast majority of applications.

                  You may hate AI but that doesn't excuse being ignorant about how it works.
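To see concretely what tokenization means here: the model never receives the string "strawberry", only integer IDs for subword chunks. A quick sketch, assuming the open-source tiktoken package is installed (the exact IDs and splits depend on the encoding):

    # The model is asked about letters it was never shown: input text is
    # converted to integer token IDs for subword chunks, not characters.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    ids = enc.encode("strawberry")
    print(ids)                                              # a short list of integers
    print([enc.decode_single_token_bytes(i) for i in ids])  # byte chunks, not single letters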

                  • R [email protected]

                    Sqverrrrl.

                    merc@sh.itjust.worksM This user is from outside of this forum
                    merc@sh.itjust.worksM This user is from outside of this forum
                    [email protected]
                    wrote on last edited by
                    #90

                    Oh yeah, I forgot about how they add a "v" sound to it.

[email protected] (#91), replying to [email protected]:

> Maybe they should call it what it is:
>
> Machine Learning algorithms from 1990 repackaged and sold to us by marketing teams.

A machine learning algorithm from 2017 (the transformer), scaled up a few orders of magnitude so that it finally more or less works, then repackaged and sold by marketing teams.

[email protected] (#92), replying to [email protected]:

> There's a thought experiment that challenges the concept of cognition, called the Chinese Room. […] That's modern LLMs in a nutshell.

                        You've missed something about the Chinese Room. The solution to the Chinese Room riddle is that it is not the person in the room but rather the room itself that is communicating with you. The fact that there's a person there is irrelevant, and they could be replaced with a speaker or computer terminal.

                        Put differently, it's not an indictment of LLMs that they are merely Chinese Rooms, but rather one should be impressed that the Chinese Room is so capable despite being a completely deterministic machine.

                        If one day we discover that the human brain works on much simpler principles than we once thought, would that make humans any less valuable? It should be deeply troubling to us that LLMs can do so much while the mathematics behind them are so simple. Arguments that because LLMs are just scaled-up autocomplete they surely can't be very good at anything are not comforting to me at all.

                        • J [email protected]

                          People who think that LLMs having trouble with these questions is evidence one way or another about how good or bad LLMs are just don't understand tokenization. This is not a symptom of some big-picture deep problem with LLMs; it's a curious artifact like in a jpeg image, but doesn't really matter for the vast majority of applications.

                          You may hate AI but that doesn't excuse being ignorant about how it works.

                          U This user is from outside of this forum
                          U This user is from outside of this forum
                          [email protected]
                          wrote on last edited by
                          #93

These sorts of artifacts wouldn't be a huge issue, except that AI is being pushed to the general public as an alternative means of learning basic information. The meme example is obvious to someone with a strong understanding of English, but learners and children might get an artifact and stamp it into their memory, working for years off bad information. A few false things every now and then are not a problem; that's unavoidable in learning. Thousands accumulated over long-term use, however, and your understanding of the world becomes coarser, like Swiss cheese with voids so large it can't hold itself up.

                          • J [email protected]

                            People who think that LLMs having trouble with these questions is evidence one way or another about how good or bad LLMs are just don't understand tokenization. This is not a symptom of some big-picture deep problem with LLMs; it's a curious artifact like in a jpeg image, but doesn't really matter for the vast majority of applications.

                            You may hate AI but that doesn't excuse being ignorant about how it works.

                            _ This user is from outside of this forum
                            _ This user is from outside of this forum
                            [email protected]
                            wrote on last edited by
                            #94

                            And yet they can seemingly spell and count (small numbers) just fine.

[email protected] (#95), replying to [email protected]:

> How do you pronounce "Mrs" so that there's an "r" sound in it?

                              "His property"

                              Otherwise it's just Ms.

[email protected] (#96), replying to [email protected]:

> There's a thought experiment that challenges the concept of cognition, called the Chinese Room. […] That's modern LLMs in a nutshell.

That's a very long answer to my snarky little comment 🙂 I appreciate it, though. Personally, I find LLMs interesting and I've spent quite a while playing with them. But in the end they are just what you described: an interconnected catalogue of random stuff, with some hallucinations to fill the gaps. They are NOT a reliable source of information or general knowledge, nor are they safe to use as an "assistant". The marketing of LLMs as fit for such purposes is the problem. Humans tend to turn off their brains and blindly trust technology, and the tech companies encourage them to do so by making false promises.

                                • Q [email protected]

                                  I really like checking these myself to make sure it’s true. I WAS NOT DISAPPOINTED!

                                  (Total Rs is 8. But the LOGIC ChatGPT pulls out is ……. remarkable!)

                                  zacryon@feddit.orgZ This user is from outside of this forum
                                  zacryon@feddit.orgZ This user is from outside of this forum
                                  [email protected]
                                  wrote on last edited by
                                  #97

                                  "Let me know if you'd like help counting letters in any other fun words!"

Oh well, these newish engagement prompts sure reach ridiculous extremes sometimes.

[email protected] (#98), replying to [email protected]:

> How do you pronounce "Mrs" so that there's an "r" sound in it?

                                    I don’t, but it’s abbreviated with one.

                                    • J [email protected]

                                      Machine learning algorithm from 2017, scaled up a few orders of magnitude so that it finally more or less works, then repackaged and sold by marketing teams.

                                      softestsapphic@lemmy.worldS This user is from outside of this forum
                                      softestsapphic@lemmy.worldS This user is from outside of this forum
                                      [email protected]
                                      wrote on last edited by [email protected]
                                      #99

                                      Adding weights doesn't make it a fundamentally different algorithm.

                                      We have hit a wall where these programs have combed over the totality of the internet and all available datasets and texts in existence.

There isn't any more training data to improve with, and these programs have started polluting the internet with bad data that will make them even dumber and more incorrect in the long run.

                                      We're done here until there's a fundamentally new approach that isn't repetitive training.

                                      • U [email protected]

                                        "His property"

                                        Otherwise it's just Ms.

                                        I This user is from outside of this forum
                                        I This user is from outside of this forum
                                        [email protected]
                                        wrote on last edited by
                                        #100

                                        Mrs. originally comes from mistress, which is why it retains the r.

[email protected] (#101), replying to [email protected]:

> I get the meme aspect of this. But just to be clear, it was never fair to judge LLMs for specifically this. The LLM doesn't even see the letters in the words, as every word is broken down into tokens, which are numbers. I suppose with a big enough corpus of data it might eventually extrapolate which words have which letter from texts describing these words, but normally it shouldn't be expected.

I know that words are tokenized in the vanilla transformer. But do GPT and similar LLMs still do that as well? I assumed they also tokenize on the character/symbol level, possibly mixed with additional abstraction further down the chain.
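One way to check this empirically: GPT-style models use subword (BPE-style) vocabularies rather than per-character tokens, so frequent words often come out as a single token while unusual strings split into chunks. A sketch under the same assumption that tiktoken is available (splits vary by encoding):

    # Frequent words are often a single token; rare strings split into
    # multi-byte chunks. Neither case is character-level tokenization.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    for word in (" the", " squirrel", " Sqverrrrl"):
        pieces = [enc.decode_single_token_bytes(i) for i in enc.encode(word)]
        print(word, "->", pieces)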
