AGI achieved 🤖

[email protected]

What do you mean by spell fine? They're just emitting the tokens for the words. Like, it's not writing "strawberry," it's writing tokens <302, 1618, 19772>, which correspond to st, raw, and berry respectively. If you ask it to put a space between each letter, that will disrupt the tokenization mechanism, and it's going to be quite liable to make mistakes.
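To make the token point concrete, here is a rough Python sketch. It assumes a toy three-entry vocabulary built from the IDs quoted in the comment above; those IDs are not checked against any real tokenizer, and `TOY_VOCAB`, `decode`, and `greedy_encode` are made-up names for illustration only.

```python
# Toy vocabulary using the IDs quoted in the comment above.
# Assumption: these are illustrative, not verified against a real tokenizer.
TOY_VOCAB = {302: "st", 1618: "raw", 19772: "berry"}


def decode(token_ids):
    """Join the text pieces that the token IDs stand for."""
    return "".join(TOY_VOCAB[t] for t in token_ids)


def greedy_encode(text):
    """Hypothetical longest-match encoder over the toy vocabulary."""
    # Try longer pieces first so "berry" wins over any shorter overlap.
    pieces = sorted(TOY_VOCAB.items(), key=lambda kv: -len(kv[1]))
    ids, pos = [], 0
    while pos < len(text):
        for token_id, piece in pieces:
            if text.startswith(piece, pos):
                ids.append(token_id)
                pos += len(piece)
                break
        else:
            # Nothing in the toy vocabulary matches at this position.
            raise ValueError(f"no toy token covers {text[pos:]!r}")
    return ids


print(greedy_encode("strawberry"))
print(decode([302, 1618, 19772]))
```

With this toy vocabulary, the spaced-out string "s t r a w b e r r y" cannot be encoded at all. A real tokenizer would instead fall back to different, smaller tokens, which is the disruption described above.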

I don't think it's really fair to say that the lookup 19772 -> berry counts as the LLM being able to spell, since the LLM isn't operating at that layer. It doesn't really emit letters directly. I would argue its inability to reliably spell words when you force it to go letter-by-letter or answer queries about how words are spelled is indicative of its poor ability to spell.

[email protected]

That came later though, as in “I had dinner with the Mrs last night.”

[email protected]

Correct. I didn’t say there was an r sound, but that it was going off of the spelling. I agree there’s no r sound.

[email protected]

Transformers were pretty novel in 2017, I don't know if they were really around before that.

Anyway, I'm doubtful that a larger corpus is what's needed at this point. (Though that said, there's a lot more text remaining in instant messenger chat logs, like Discord, that has probably yet to be integrated into LLMs. Not sure.) I'm also doubtful that scaling up is going to keep working, but it wouldn't surprise me that much if it does keep working for a long while. My guess is that there are some small tweaks to be discovered that really improve things a lot but still basically look like repetitive training, as you put it. Who can really say though.

[email protected]

Can you explain the difference between understanding the question and generating the words that might logically follow? I'm aware that it's essentially a more powerful version of how auto-correct works, but why should we assume that shows some lack of understanding at a deep level somehow?

[email protected]

No, I'm talking about human learning and the danger posed by treating an imperfect tool as a reliable source of information, as these companies want people to do.

Whether the erratic information is from tokenization or hallucinations is irrelevant when this is already the main source for so many people in their learning, for example when picking up a new language.

[email protected]

Yes, but it did come, and took hold as the common usage. So much so that Ms. is used to describe a woman both with and without reference to marital status.

I'm down with using Mrs. not to refer to marital status, but imo just going with Ms. is clearer and easier because of how deeply associated Mrs. is with it.

[email protected]

what do you mean by spell fine?

I mean that when you ask them to spell a word they can list every character one at a time.

[email protected]

Have you ever been to a very dense jungle or forest... at midnight?

Ok, now, drop mortar and naval artillery shells all over it.

For weeks, or months.

The holes this creates are commonly used by both sides as cover and concealment.

Also, it's often raining, sometimes quite heavily, such that these holes will fill up with water, and you are thus soaking wet.

Ok, now, add in pillboxes and bunkers, as well as a few spiderwebs of underground tunnel networks, many of which have concealed entrances.

You do not have a phone. GPS does not exist.

You might have a map, which is out of date, and you might have a compass, if you didn't drop or break it.

A radio is either something stationary, or is approximately the size and weight of a miniature refrigerator, maybe somewhat less, and one bullet or good piece of shrapnel will take it out of commission.

Ok, now, you and all your buddies are either half starving or actually starving, beyond exhausted, getting maybe an average of 2 to 4 hours of sleep, and you, and the enemy, are covered in dirt, blood and grime.

Also, you and everyone else may or may not have malaria, or some other fun disease, so add shit and vomit to the mix of what everyone is covered in.

Ok! Enjoy your 2 to 8 week long camping trip from hell, in these conditions... also, kill everyone that is trying to kill you, soldier.

[email protected]

I wonder how QWEN 3.0 performs, since it apparently surpasses Deepseek

[email protected]

With reasoning (this is QWEN on HuggingChat; it says there is zero)

[email protected]

That’s up to you, I much prefer Mrs. Ms. feels somehow condescending to me.

[email protected]

Tested on ChatGPT o4-mini-high

It sent me this

0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0
0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0
0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0
0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0
0 1 1 1 0 0 0 0 1 1 1 0 0 0 0 0
1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0

I asked it to remove the spaces


0001111100000000
0011111111000000
0011111110000000
0111111111100000
0111111111110000
0011111111100000
0001111111000000
0011111100000000
0111111111100000
1111111111110000
1111111111110000
1111111111110000
1111111111110000
0011100111000000
0111000011100000
1111000011110000
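For what it's worth, the "remove the spaces" step doesn't need a model at all. A minimal Python sketch, using the first two rows of the grid above as sample input (`strip_spaces` and `render` are hypothetical helper names, and the `#`/`.` rendering is just one arbitrary way to view the bitmap):

```python
def strip_spaces(rows):
    """Remove the spaces between the 0/1 cells of each row."""
    return [row.replace(" ", "") for row in rows]


def render(rows):
    """Draw set cells as '#' and empty cells as '.' for easier viewing."""
    return [row.replace("1", "#").replace("0", ".") for row in rows]


# First two rows of the grid posted above.
spaced = [
    "0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0",
    "0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0",
]

compact = strip_spaces(spaced)
for line in render(compact):
    print(line)
```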

I guess I just murdered a bunch of trees and killed a random dude with the water it used, but it looks good

[email protected]

Try with o4-mini-high. It’s made to think like a human by checking its answer and working step by step, rather than just kinda guessing one like here

[email protected]

The problem is that it's not actually counting anything. It's simply looking for some text somewhere in its training data that relates to that word and the number of R's in that word. There's no mechanism within the LLM to actually count things. It is not designed with that function. This is not general AI, this is a generative language model that's using its vast, vast store of text to put words together that sound like they answer the question that was asked.
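For contrast, this is what actually counting looks like when a program works on characters directly. A minimal Python sketch (`count_letter` is a made-up helper name, and the case-insensitive comparison is an assumption about what the question intends):

```python
def count_letter(word, letter):
    """Count occurrences of a letter in a word, ignoring case."""
    return word.lower().count(letter.lower())


def spell_out(word):
    """List every character of the word, one at a time."""
    return list(word)


print(spell_out("strawberry"))
print(count_letter("strawberry", "r"))  # 3
```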

[email protected]

Here I am, emissions the size of a small country, and they ask me to count letters...

[email protected]

It's painful how Reddit that is...

So,

Now,

Alright,

[email protected]

What is this devilry?

[email protected]

Lol someone could absolutely do that as a character card.
