Save The Planet

[email protected]

He isn't talking about running it locally, he is talking about what it takes for the AI providers to provide the AI.

Whether "it takes more energy during training" holds depends entirely on the load put on the inference servers and the size of the inference server farm.

[email protected]

TBH most people still use old SDXL finetunes for porn, even with the availability of newer ones.

[email protected]

And including the word "fuck" in your query no longer stops it.

[email protected]

Not at all. Not even close.

Image generation is usually batched and takes seconds, so that's about 700W (a single H100 SXM) for a few seconds to produce a batch of a few images for multiple users. Maybe more for the absolute biggest (but SFW, no porn) models.

LLM generation takes more VRAM, but is MUCH more compute-light. Typically one has banks of 8 GPUs in multiple servers serving many, many users at once. Even my lowly RTX 3090 can serve 8+ users in parallel with TabbyAPI (and a modestly sized model) before becoming compute bound.

So in a nutshell, imagegen (on an 80GB H100) is probably more like 1/4 to 1/8 of a video game's power draw per user (not 8 games at once), and only for a few seconds.

Text generation is similarly efficient, if not more. Responses take longer (many seconds, except on special hardware like Cerebras CS-2s), but it's parallelized over dozens of users per GPU.

This is excluding more specialized hardware like Google's TPUs, Huawei NPUs, Cerebras CS-2s and so on. These are clocked far more efficiently than Nvidia/AMD GPUs.

...The worst are probably video generation models. These are extremely compute-intensive and take a long time (at the moment), so you are burning something like a few minutes of gaming time per output.

ollama/sd-web-ui are terrible analogs for all this because they are single user, and relatively unoptimized.
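To put rough numbers on the image generation estimate above, here is a back-of-envelope sketch. Only the 700W H100 SXM figure comes from the post. The batch duration, the batch size, and the 350W gaming draw are assumptions picked for illustration, and the function names are made up for this example, so treat the result as an order-of-magnitude guess and not a measurement.

```python
# Back-of-envelope energy per generated image.
# Only H100_SXM_WATTS comes from the post; every other input is an assumption.

def watt_hours(power_watts: float, seconds: float) -> float:
    """Energy in watt-hours for a device drawing power_watts for the given seconds."""
    return power_watts * seconds / 3600.0


def energy_per_output_wh(power_watts: float, seconds: float, outputs_per_batch: int) -> float:
    """Split the energy of one batch evenly across the outputs in that batch."""
    return watt_hours(power_watts, seconds) / outputs_per_batch


H100_SXM_WATTS = 700.0    # figure quoted in the post
BATCH_SECONDS = 5.0       # assumption: "a few seconds"
IMAGES_PER_BATCH = 4      # assumption: "a batch of a few images"
GAMING_GPU_WATTS = 350.0  # assumption: a gaming GPU running flat out

# Energy attributed to a single image in the batch.
image_wh = energy_per_output_wh(H100_SXM_WATTS, BATCH_SECONDS, IMAGES_PER_BATCH)

# How many seconds of gaming at GAMING_GPU_WATTS use the same energy.
gaming_seconds_equiv = image_wh * 3600.0 / GAMING_GPU_WATTS

print(f"{image_wh:.3f} Wh per image, about {gaming_seconds_equiv:.1f} s of gaming")
```

Swap in your own guesses for the three assumed constants. The point is only that the per-image figure scales with batch time and inversely with batch size.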

[email protected]

"The plane is flying, anyway."

[email protected]

When I’m told there’s power issues and to conserve power I drop my AC to 60 and leave all my lights on. Only way for them to fix the grid is to break it.

[email protected]

I like tits.

[email protected]

Use Qwen 2.5, that's my recommendation. You can also set "pals". And the best part is I have a portable battery and solar charger, so I could theoretically (and have in the past) run it from solar alone.

[email protected]

I meant to mention the other ones at fault, but I edited what I was typing and backspaced that part.

Thanks

[email protected]

Are you interpreting my statement as being in favour of training AIs?

[email protected]

So you think they're all at full load at all times? Does that seem reasonable to you?

[email protected]

It's literally the same thing. The obvious difference is how much usage each GPU is getting at a time, but everyone seems to assume all these data centers are running at full load at all times for some reason?

[email protected]

There's no functional difference aside from usage and scale, which is my point.

I find it interesting that the only actual energy calculations I see from researchers are for the training and the things that go along with it, rather than for the usage per actual request after training.

People then conflate training energy costs with normal usage costs without data to back it up. I don't have the data either, but I do have what I can do/see on my side.
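For what it's worth, the relationship between the two costs is simple to write down even without real data. The sketch below shows how a one-off training cost compares against per-request usage as the number of requests grows. The function names are hypothetical and the numbers are pure placeholders, not measurements of any real model.

```python
# Training vs. usage energy: how a one-off training cost compares with
# per-request inference cost. All numbers below are placeholders, NOT real data.

def breakeven_requests(training_wh: float, per_request_wh: float) -> float:
    """Requests served at which total inference energy equals training energy."""
    if per_request_wh <= 0:
        raise ValueError("per_request_wh must be positive")
    return training_wh / per_request_wh


def training_share(training_wh: float, per_request_wh: float, requests: int) -> float:
    """Fraction of total energy (training + inference) that went into training."""
    inference_wh = per_request_wh * requests
    return training_wh / (training_wh + inference_wh)


TRAINING_WH = 1_000_000.0  # placeholder, not a measured training cost
PER_REQUEST_WH = 0.5       # placeholder, not a measured per-request cost

n = breakeven_requests(TRAINING_WH, PER_REQUEST_WH)
print(f"break-even after {n:,.0f} requests")
print(f"training share at break-even: {training_share(TRAINING_WH, PER_REQUEST_WH, int(n)):.0%}")
```

Below the break-even count training dominates, above it usage dominates, which is why the answer depends on how much load the inference side actually sees.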

[email protected]

Then I guess it's time to stop using Google!

[email protected]

I'm here waiting for it

[email protected]

Doesn't seem to be a waste of power to me.

[email protected]

The difference between demand and net demand in that graph is purely solar/wind generation, isn't it?

[email protected]

And when it did it also altered the results, making them worse, because it was trying to satisfy "fuck" as part of your search.

[email protected]

Do the new models even have non-"smart" fittings? I thought all the electronic chip plants closed during covid.

[email protected]

Given that cloud providers are desperately trying to get more compute resources, but are limited by chip production - yes, of course? Why do you think they'd be trying to expand their resources if their existing resources weren't already at capacity?

agnos.is Forums
