Why I am not impressed by A.I.
-
[email protected] replied to [email protected] last edited by
They’re pretty good at summarizing, but don’t trust the summary to be accurate, just to give you a decent idea of what something is about.
That is called being terrible at summarizing.
-
[email protected] replied to [email protected] last edited by
Why do you ask?
-
[email protected] replied to [email protected] last edited by
We also didn't make the Model T suggest replacing the engine when the oil light comes on. Cars, as it happens, aren't that great at self-diagnosis, despite that technology being far simpler and further along than generative models are. I don't trust the model to tell me what temperature to bake a cake at, so I'm sure as hell not going to trust it with medical information. Googling symptoms was risky at best before. It's a horror show now.
-
[email protected] replied to [email protected] last edited by
This is a bad example. If I ask a friend "is strawberry spelled with one or two r's?", they would think I'm asking about the last part of the word.
The question seems to be specifically made to trip up LLMs. I've never heard anyone ask how many of a certain letter are in a word. I've heard people ask how you spell a word and whether it's with one or two of a specific letter, though.
If you think of LLMs as something with actual intelligence, you're going to be very unimpressed. It's just a model to predict the next word.
-
[email protected] replied to [email protected] last edited by
It makes perfect sense if you do mental acrobatics to explain why a wrong answer is actually correct.
-
[email protected] replied to [email protected] last edited by
How do you validate the accuracy of what it spits out?
Why don't you skip the AI and just use the thing you use to validate the AI output?
-
[email protected] replied to [email protected] last edited by
It wasn't focusing on anything. It was generating text per its training data. There's no logical thought process whatsoever.
-
[email protected] replied to [email protected] last edited by
Here’s a bit of code that’s supposed to do stuff. I got this error message. Any ideas what could cause this error and how to fix it? Also, add this new feature to the code.
Works reasonably well as long as you have some idea how to write the code yourself. GPT can do it in a few seconds; debugging it would take like 5-10 minutes, but that's still faster than my best attempt. Besides, GPT is also fairly fluent in many functions I have never used before. My approach would be clunky and convoluted, while the code generated by GPT is a lot shorter.
-
[email protected] replied to [email protected] last edited by
This is literally just a tokenization artifact. If I asked you how many r’s are in /0x5273/0x7183 you’d be confused too.
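To make that concrete, here's a rough sketch (the token split below is made up for illustration; real tokenizers emit integer ids and vary by model) of why counting letters is an unnatural question for a model that only ever sees whole tokens:

```lua
-- illustrative only: a hypothetical token split for "strawberry";
-- a real tokenizer would hand the model opaque integer ids instead
local tokens = { "str", "aw", "berry" }

-- character-level access is easy for us once the pieces are reassembled...
local word = table.concat(tokens)   -- "strawberry"
local _, rs = word:gsub("r", "")    -- count the r's
print(word, rs)                     -- strawberry   3

-- ...but the model never sees individual characters, only whole tokens,
-- so "how many r's?" asks about structure its input does not expose
```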
-
[email protected] replied to [email protected] last edited by
i'm still not entirely sold on them, but since i'm currently using one that the company subscribes to, i can give a quick opinion:
i had an idea for a code snippet that could save me some headache (a mock for primitives in lua, to be specific), but i foresaw some issues with commutativity (aka how to make sure that a + b == b + a). so i asked about this, and the llm created some boilerplate to test this code. i've been chatting with it for about half an hour, and had it expand the idea to all possible metamethods available on primitive types, together with about 50 test cases with descriptive assertions. i've now run into an issue where the __eq metamethod isn't firing correctly when one of the operands is a primitive rather than a mock, and after having the llm link me to the relevant part of the docs, that seems to be a feature of the language rather than a bug.

so in 30 minutes i've gone from a loose idea to a well-documented proof-of-concept to a roadblock that can't really be overcome. complete exploration and feasibility study, fully tested, in less than an hour.
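for the curious, the roadblock is a documented part of lua's equality rules: __eq is only consulted when both operands are tables (or both full userdata), so a mock compared against a raw primitive just evaluates to false. a minimal sketch, assuming lua 5.3+ and a made-up mock() constructor rather than the actual generated code:

```lua
-- minimal sketch, not the actual code from the chat: a mock wrapping a
-- primitive value, forwarding equality to the wrapped values
local Mock = {}
Mock.__index = Mock
Mock.__eq = function(a, b)
  print("__eq fired")
  return a.value == b.value
end

local function mock(v)
  return setmetatable({ value = v }, Mock)
end

print(mock(3) == mock(3))  -- "__eq fired", then true: both operands are mock tables
print(mock(3) == 3)        -- false, and __eq never fires: lua skips the metamethod
                           -- unless both operands are tables (or both full userdata)
```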
-
[email protected] replied to [email protected] last edited by
That makes sense as long as you're not writing code that needs to know how to do something as complex as ...checks original post... count.
-
[email protected] replied to [email protected] last edited by
That depends on how you use it. If you need the information from an article but don't want to read it, I agree, an LLM is probably the wrong tool. If you have several articles and want to decide which one has the information you need, an LLM is a pretty good option.
-
[email protected] replied to [email protected] last edited by
Most of what I'm asking it are things I have a general idea of, and AI is capable of making short explanations of complex things. So typically it's easy to spot a hallucination, but the pieces that I don't already know are easy to Google to verify.
Basically I can get a shorter response with the same outcome, and validate those small pieces, which saves a lot of time (I no longer have to read a 100-page white paper; instead I read a few paragraphs and then verify the small bits).
-
[email protected] replied to [email protected] last edited by
I know, right? It's not a fruit, it's a vegetable!
-
[email protected] replied to [email protected] last edited by
I've already had more than one conversation where people quote AI as if it were a source, like quoting Google as a source. When I show them how it can sometimes lie and explain that it's not a primary source for anything, I just get that blank stare, like I have two heads.
-
[email protected] replied to [email protected] last edited by
Just playing, friend.
-
[email protected] replied to [email protected] last edited by
Not mental acrobatics, just common sense.
-
[email protected] replied to [email protected] last edited by
That's a very different problem than the one in the OP.
-
[email protected] replied to [email protected] last edited by
I use AI like that, except I'm not using the same shit everyone else is on. I use a Dolphin fine-tuned model with tool use, hooked up to an embedder and SearXNG. Every claim it makes is sourced.
-
[email protected] replied to [email protected] last edited by
Correct.