the beautiful code
-
You could claim that it knows the pattern of how references are formatted, depending on what you mean by the word "know". Therefore, it's a 100% uninteresting discussion of semantics.
The theory of knowledge (epistemology) is a distinct and storied area of philosophy, not a debate about semantics.
There remains to this day strong philosophical debate on how we can be sure we really "know" anything at all, and thought experiments such as the Chinese Room illustrate that "knowing" is far, far more complex than we might believe.
For instance, is it simply following a set path like a river in a gorge? Is it ever actually "considering" anything, or just doing what it's told?
-
I'm pretty sure that is how we got CORBA
now just make it construct UML models and then abandon this and move onto version 2
-
I used ChatGPT to help me make a package with SUSE's Open Build Service. It was actually quite good. Was pulling my hair out for a while until I noticed that the project I wanted to build had changed URLs and I was using an outdated one.
In the end I just had to get one last detail right. And then my ChatGPT 4 allowance dried up and they dropped me back down to 3 and it couldn't do anything. So I had to use my own brain, ugh.
ChatGPT is the worst among the big chatbots at writing code. From my experience: Deepseek > Perplexity > Gemini > Claude.
-
But text is also numbers
But numbers are also text
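Both are literally true; a tiny round-trip illustration (standard Web APIs, nothing assumed):

```ts
// The same data, viewed as numbers and as text.
const bytes = new TextEncoder().encode("code");  // Uint8Array [99, 111, 100, 101]
const text  = new TextDecoder().decode(bytes);   // "code"
console.log(bytes, text);
```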
-
Uh yeah, like all the time. Anyone who says otherwise really hasn’t tried recently. I know it’s a meme that AI can’t code (and still in many cases that’s true, eg. I don’t have the AI do anything with OpenCV or complex math) but it’s very routine these days for common use cases like web development.
They have been pretty good on popular technologies like python & web development.
I tried to do Kotlin for Android, and they kept tripping over themselves; it's hilarious and frustrating at the same time.
-
Yeah, you can tell it just ratholes on trying to force one concept to work rather than realizing it's not the correct concept to begin with.
That’s exactly what most junior devs do when stuck. They rehash the same solution over and over, and it almost seems like LLMs trained on code bases infer that behavior from commit histories, etc.
It almost feels like one of those “we taught him these tasks incorrectly as a joke” scenarios.
-
Did the generated code get merged? I'd be curious to see the PRs
The lead dev is not available this summer to review, but you can review here: https://github.com/edzdez/sway-easyfocus/pull/22
It's not great that four changes are rolled into a single PR, but that's my issue, not Claude's; they were related and I wanted to test them all at once.
-
I'm pretty sure that is how we got CORBA
now just make it construct UML models and then abandon this and move onto version 2
Hello, fellow old person
-
Uh yeah, like all the time. Anyone who says otherwise really hasn’t tried recently. I know it’s a meme that AI can’t code (and still in many cases that’s true, eg. I don’t have the AI do anything with OpenCV or complex math) but it’s very routine these days for common use cases like web development.
I recently tried it for scripting simple things in Python for a game. Yaknow, change a char's color if they are targeted. It output a shitton of word salad and code about my specific use case in the specific scripting jargon for the game.
It was all based on "Misc.changeHue(player)". A function that doesn't exist and never has, because the game is unable to color other mobs / players like that for scripting.
Anything I tried with AI ended up the same way. Broken code in 10 lines of a script, hallucinations and bullshit spewed as the absolute truth. Anything out of the ordinary is met with "yes this can totally be done, this is how", and "how" doesn't work, and after sifting forums / asking devs you find out "sadly that's impossible" or "we don't actually use cpython so libraries don't work like that", etc.
-
I wouldn't say it's accurate that this was a "mechanical" upgrade, having done it a few times. They even have a migration tool which you'd think could fully do the upgrade, but out of the probably 4-5 projects I've upgraded, the migration tool always produced a config that errored and needed several obscure manual changes to get working. All that to say, it seems like a particularly bad candidate for LLMs.
Then I am quite confused what an LLM is supposed to help me with. I am not a programmer, and I am certainly not a TypeScript programmer. This is why I postponed my eslint upgrade for half a year, since I don't have a lot of experience in TypeScript besides one project in my college webdev class.
So if I can sit down for a couple of hours to port my rather simple eslint config, which is arguably the most mechanical task I have seen in my limited programming experience, and an LLM can't produce anything close to correct, then I am rather confused what "real programmers" would use it for...
People here say boilerplate code, but honestly I don't quite recall the last time I needed to write a lot of boilerplate code.
I have also tried to use an LLM to debug SELinux and Docker containers on my homelab; unfortunately, it is absolutely useless at that as well.
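Presumably the port mentioned above is the eslint flat-config migration. For reference, a minimal sketch of what the target shape looks like, assuming a TypeScript project using typescript-eslint; the package names are real, but the file globs and rules are purely illustrative:

```ts
// eslint.config.js (flat config); an illustrative sketch, not the config from the post above.
import js from "@eslint/js";
import tseslint from "typescript-eslint";

export default [
  js.configs.recommended,          // replaces "extends": "eslint:recommended"
  ...tseslint.configs.recommended, // replaces the @typescript-eslint preset
  {
    files: ["**/*.ts"],
    rules: {
      // individual rule overrides now live in per-object "rules" blocks
      "no-unused-vars": "off",
      "@typescript-eslint/no-unused-vars": "warn",
    },
  },
];
```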
-
They have been pretty good on popular technologies like python & web development.
I tried to do Kotlin for Android, and they kept tripping over themselves; it's hilarious and frustrating at the same time.
I use ChatGPT for Go programming all the time and it rarely has problems; I think Go is more niche than Kotlin.
-
To be fair, if I wrote 3000 new lines of code in one shot, it probably wouldn’t run either.
LLMs are good for simple bits of logic under around 200 lines of code, or things that are strictly boilerplate. People who are trying to force it to do things beyond that are just being silly.
Practically all LLMs aren't good for any logic. Try to play ASCII tic tac toe against one. All GPT models lost against my four-year-old niece, and I wouldn't trust her to write production code.
Once a single model (doesn't have to be an LLM) can beat Stockfish in chess, AlphaGo in Go, my niece in tic tac toe, and can one-shot (on the surface, scratch-pad allowed) a Rust program that compiles and works, then we can start thinking about replacing engineers.
Just take a look at the dotnet runtime source code, where Microsoft employees currently try to work with Copilot, which writes PRs with errors like forgetting to add files to projects, writing code that doesn't compile, fixing symptoms instead of underlying problems, etc. (just take a look yourself).
I'm not saying that AI (especially AGI) can't replace humans. It definitely can and will; it's just a matter of time. But state-of-the-art LLMs are basically just extremely good "search engines" or interactive versions of Stack Overflow, not good enough to do real "thinking tasks".
-
4o has been able to do this for months.
Play ASCII tic tac toe against 4o a few times. A model that can't even draw a tic tac toe game consistently shouldn't write production code.
-
Practically all LLMs aren't good for any logic. Try to play ASCII tic tac toe against one. All GPT models lost against my four-year-old niece, and I wouldn't trust her to write production code.
Once a single model (doesn't have to be an LLM) can beat Stockfish in chess, AlphaGo in Go, my niece in tic tac toe, and can one-shot (on the surface, scratch-pad allowed) a Rust program that compiles and works, then we can start thinking about replacing engineers.
Just take a look at the dotnet runtime source code, where Microsoft employees currently try to work with Copilot, which writes PRs with errors like forgetting to add files to projects, writing code that doesn't compile, fixing symptoms instead of underlying problems, etc. (just take a look yourself).
I'm not saying that AI (especially AGI) can't replace humans. It definitely can and will; it's just a matter of time. But state-of-the-art LLMs are basically just extremely good "search engines" or interactive versions of Stack Overflow, not good enough to do real "thinking tasks".
Cherry picking the things it doesn’t do well is fine, but you shouldn’t ignore the fact that it DOES do some things easily also.
Like all tools, use them for what they’re good at.
-
I’ve never thought of it that way. I’m going to add copywriter to my resume.
This made me laugh so hard one of the dogs came to check in on me.
-
I recently tried it for scripting simple things in Python for a game. Yaknow, change a char's color if they are targeted. It output a shitton of word salad and code about my specific use case in the specific scripting jargon for the game.
It was all based on "Misc.changeHue(player)". A function that doesn't exist and never has, because the game is unable to color other mobs / players like that for scripting.
Anything I tried with AI ended up the same way. Broken code in 10 lines of a script, hallucinations and bullshit spewed as the absolute truth. Anything out of the ordinary is met with "yes this can totally be done, this is how", and "how" doesn't work, and after sifting forums / asking devs you find out "sadly that's impossible" or "we don't actually use cpython so libraries don't work like that", etc.
Well yeah, it’s working from an incomplete knowledge of the code base. If you asked a human to do the same they would struggle.
LLMs work only if they can fit the whole context into their memory, and that means working only in highly limited environments.
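Some rough back-of-envelope numbers for why that matters; every figure below is an assumption, not a measurement:

```ts
// Back-of-envelope sketch: does a whole codebase fit in a model's context window?
const linesOfCode = 200_000;    // a modest real-world project (assumed)
const tokensPerLine = 10;       // rough average for source code (assumed)
const contextWindow = 128_000;  // a typical large-model context window (assumed)

const tokensNeeded = linesOfCode * tokensPerLine; // 2,000,000 tokens
console.log(tokensNeeded > contextWindow);        // true: the codebase doesn't fit
```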
-
Cherry picking the things it doesn’t do well is fine, but you shouldn’t ignore the fact that it DOES do some things easily also.
Like all tools, use them for what they’re good at.
I don't think it's cherry picking. Why would I trust a tool with way more complex logic when it can't even prevent three crosses in a row? Writing pretty much any software that does more than render a few buttons typically requires a lot of planning and thinking, and those models clearly don't have the capability to plan and think when they lose tic tac toe games.
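For what it's worth, the "prevent three in a row" check being described is about as small as logic gets; a minimal sketch, with every name invented purely for illustration:

```ts
// Given a tic-tac-toe board, find the cell that blocks the opponent's ("X")
// immediate three-in-a-row. Hypothetical names, just to show how little
// planning the check requires.
type Cell = "X" | "O" | " ";
type Board = Cell[]; // 9 cells, row-major

const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8], // rows
  [0, 3, 6], [1, 4, 7], [2, 5, 8], // columns
  [0, 4, 8], [2, 4, 6],            // diagonals
];

function blockingMove(board: Board): number | null {
  for (const [a, b, c] of LINES) {
    const cells = [board[a], board[b], board[c]];
    // Two opponent marks and one empty cell in a line: play the empty one.
    if (cells.filter(v => v === "X").length === 2 && cells.includes(" ")) {
      return [a, b, c][cells.indexOf(" ")];
    }
  }
  return null; // no immediate threat
}

// Example: X threatens the top row; the correct reply is cell 2.
console.log(blockingMove(["X", "X", " ", " ", "O", " ", " ", " ", " "])); // 2
```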
-
I don't think it's cherry picking. Why would I trust a tool with way more complex logic when it can't even prevent three crosses in a row? Writing pretty much any software that does more than render a few buttons typically requires a lot of planning and thinking, and those models clearly don't have the capability to plan and think when they lose tic tac toe games.
Why would I trust a drill press when it can’t even cut a board in half?
-
Code that does not work is just text.
Code that works is also just text.
-
Why would I trust a drill press when it can’t even cut a board in half?
A drill press (or its inventor) doesn't claim it can do that, but with LLMs the claim is that they can replace humans on a lot of thinking tasks. They even brag with test benchmarks, claim Bachelor's, Master's, and PhD-level intelligence, and call them "reasoning" models, yet they still fail to beat my niece at tic tac toe, who, by the way, doesn't have a PhD in anything.
LLMs are typically good at things that appeared a lot during training. If you are writing software, there certainly are things the LLM saw a lot of during training. But this is actually the biggest problem: it will happily generate code that might look OK, even during PR review, but might blow up in your face a few weeks later.
If they can't handle things they did see during training (but sparsely, like tic tac toe), they won't be able to produce code you should use in production. I wouldn't trust any junior dev who doesn't set their O right next to the two Xs.