Help with Home Server Architecture and Hardware Selection?
-
[email protected] replied to [email protected]
This could also be caused by a bad connection or poor contact between the wire and the receptacle. Notice the side is melted, where the terminal screws would be; that's where the heat would be generated. When you put a load on it and the electrons have to jump the gap, it arcs and generates heat. Load is also a factor, on this receptacle or any downstream one, but the melting on the side looks like it was caused by arcing.
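To put rough numbers on why a bad contact gets hot (the resistance value is just an assumed figure for a degraded joint, not a measurement):

```latex
% Power dissipated at a high-resistance contact, P = I^2 R.
% Assume a degraded joint of 0.5 ohm carrying a 12 A load:
P = I^2 R = (12\,\mathrm{A})^2 \times 0.5\,\Omega = 72\,\mathrm{W}
```

That's soldering-iron levels of heat concentrated right at the terminal screw.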
-
[email protected] replied to [email protected]
I'm running 70B models on two used 3090s and an A6000 NVLink bridge. I think I got the cards for $900 each, and maybe $200 for the NVLink. Also works great.
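If you want to sanity-check that the link is actually active, a quick peer-to-peer check from PyTorch works (just a sketch; the device indices assume the two 3090s enumerate as 0 and 1):

```python
# Quick sanity check that the two 3090s can see each other peer-to-peer
# (device indices 0 and 1 are assumptions; verify against your own layout).
import torch

assert torch.cuda.device_count() >= 2, "need both cards visible"
print("P2P 0 -> 1:", torch.cuda.can_device_access_peer(0, 1))
print("P2P 1 -> 0:", torch.cuda.can_device_access_peer(1, 0))
```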
-
[email protected] replied to [email protected]
Honestly, why not just use an old laptop you have lying around to test one or two of your many projects/ideas and see how it goes, before going $4,000 deep?
-
[email protected] replied to [email protected]
I'm also on P2P 2x3090 with 48 GB of VRAM. Honestly it's a nice experience, but still somewhat limiting...
I'm currently running deepseek-r1-distill-llama-70b-awq with the Aphrodite engine, though the same applies for llama-3.3-70b. It works great and is way faster than ollama, for example. But my max context is around 22k tokens. More VRAM would allow me more context; even more would allow for speculative decoding, CUDA graphs, ...
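For reference, here's roughly how I talk to it once it's up (a minimal sketch: Aphrodite serves an OpenAI-compatible API, and the port and model name here just match my setup, so adjust for yours):

```python
# Minimal sketch: querying a local Aphrodite instance through its
# OpenAI-compatible endpoint. Port 2242 and the model name are from my
# setup; substitute your own.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:2242/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b-awq",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=512,  # leave headroom under the ~22k context ceiling
)
print(resp.choices[0].message.content)
```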
Maybe I'll drop down to a 35b model to get more context and a bit of speed. But I can't really justify the possible decrease in answer quality.
-
[email protected] replied to [email protected]
ZFS RAIDZ expansion was released a few days ago in OpenZFS 2.3.0: https://www.cyberciti.biz/linux-news/zfs-raidz-expansion-finally-here-in-version-2-3-0/
It might help you decide how much storage you want.
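For what it's worth, the expansion itself boils down to a single command; a rough sketch (pool, vdev, and device names are all made up for illustration):

```python
# Rough sketch of a RAIDZ expansion on OpenZFS 2.3.0: attach a new disk
# to an existing raidz vdev. "tank", "raidz1-0", and "/dev/sdf" are
# placeholder names only.
import subprocess

subprocess.run(["zpool", "attach", "tank", "raidz1-0", "/dev/sdf"], check=True)
```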
-
[email protected] replied to [email protected]
I'm curious, how do you run the 4x 3090s? The FE cards would be 4x3 = 12 PCIe slots and 4x16 = 64 PCIe lanes... Did you NVLink them? What about transient power spikes? Any clock or even VBIOS mods?
-
[email protected] replied to [email protected]
I have some NVLinks on the way.
Sooooo I've got a friend that used PCIe-to-OCuLink adapters and then back to PCIe to allow the cards to run outside the case, but that's not what I do; that's just the more common approach.
You can also get PCIe extension cables, but they're pricey.
I stumbled upon a Cubix device by chance, which is a huge and really expensive PCIe bus extender that does some really fancy fucking switching. But I got it at a ridiculous price, and they're hard to come by.
If I do it right, I could host 10 cards total (2 in the machine and 8 in the Cubix).
This also means that I'm running 3x 1600 W PSUs, and I'm most at risk of blowing breakers (adding a 240 V line is next lol).
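For anyone wondering why the breakers are the bottleneck, rough numbers (assuming US 120 V residential circuits):

```latex
% Worst case: three 1600 W supplies on a single 120 V circuit
% (US residential wiring assumed).
I = \frac{P}{V} = \frac{3 \times 1600\,\mathrm{W}}{120\,\mathrm{V}} = 40\,\mathrm{A}
\gg 15\text{--}20\,\mathrm{A}\ \text{(typical breaker)};
\quad \text{at } 240\,\mathrm{V},\ I = 20\,\mathrm{A}.
```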
-
[email protected] replied to [email protected]
I did a double take at that $4000 budget as well! Glad I wasn't the only one.
-
[email protected] replied to [email protected]
As with all things: prepare for it to get worse in every aspect.
-
[email protected] replied to [email protected]
Woah, this is big news!! I'd been following some of the older articles talking about this being pending, but had no idea it just released, thanks for sharing! Will just need to figure out how much of a datahoarder I'm likely to become, but it might be nice to start with fewer than 6 of the 8TB drives and expand up (though I think 4 drives is the minimum that makes sense; my understanding is also that energy consumption is roughly linear with the number of drives, though that could be very wrong, so maybe I'd even start with 4x 10-12TB drives if I can find them for a reasonable price). But thanks for flagging this!
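Back-of-the-envelope on the power scaling (the ~7 W per drive is just a typical spinning-disk figure I'm assuming, not from any spec sheet):

```latex
% Rough linear scaling of drive power with count (~7 W/drive assumed):
P_{4\ \text{drives}} \approx 4 \times 7\,\mathrm{W} = 28\,\mathrm{W},
\qquad
P_{6\ \text{drives}} \approx 6 \times 7\,\mathrm{W} = 42\,\mathrm{W}
```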
-
[email protected] replied to [email protected]
You are both totally right. I think I anchored high here just because of the LLM stuff I'm trying to get running at around a GPT-4 level (which is what I think it will take for folks in my family to actually use it vs. continuing to pass all their data to OpenAI), and it felt tough to get there without spending an arm and a leg on GPUs alone. But my plan now is to start with the NAS build, which I should be able to accomplish without spending a crazy amount, and then build out iteratively from there. As you say, I'd rather screw up and make a $500 mistake than a multi-thousand-dollar one. Thanks for the sanity check!
-
[email protected] replied to [email protected]
Thank you so much for all of this! You're definitely right that starting smaller and trying a few things out is more sensible. For now I'm going to focus on the lower-hanging fruit with the NAS build first, then build up to local AI once I have something stable (but I'll definitely be keeping an eye out for GPU deals in the meantime, so thanks for mentioning the B580 variant; it wasn't on my radar at all as an option).

The thread has definitely given me confidence that splitting things out that way makes sense as a strategy. When I first wrote it out, I was concerned that not planning everything all at once would cause me to miss some major efficiency, but it turns out self-hosting is more like gardening than I thought: it seems to grow organically with one's interest and resources over time. Sounds obvious in retrospect, but I was definitely approaching this more rigidly initially.

And thank you for the HDD rec! I think the Exos are the tier above the Ironwolf Pro I mentioned, so I'll definitely consider them (especially if they come back in stock for a reasonable price at serverpartdeals or elsewhere). Just out of curiosity, what are you using for admin on your MC server? I had heard of Pterodactyl previously, but another commenter mentioned Crafty Controller as a bit easier to work with. Thank you again for writing all of this up, it's super helpful!
-
[email protected] replied to [email protected]
This is definitely good advice. I tend to run my laptops into the ground before I replace them, but a lot of the feedback here has made me think experimenting with something much less expensive first is probably the right move instead of trying to do everything all at once (so that when I inevitably screw up, it at least won't be a $4k screw-up). Thanks for the sanity check!
-
[email protected] replied to [email protected]
Wow, that sounds amazing! I think that GPU alone would probably exceed my budget for the whole build lol. Thanks for sharing!
-
[email protected] replied to [email protected]
This is exactly the sort of tradeoff I was wondering about, thank you so much for mentioning it. I think I'd ultimately align with you in prioritizing answer quality over context length (but it sure would be nice to have both!!). My plan for now, based on some of the other comments, is to go ahead with the NAS build and keep my eyes peeled for any GPU deals in the meantime (though honestly I'm not holding my breath). Once I've proved to myself I can run something stable without burning the house down, I'll move on to something more powerful for the local LLM. Thanks again for sharing!
-
[email protected] replied to [email protected]
Thanks for sharing! Will probably try to go this route once I get the NAS squared away and turn back to local LLMs. Out of curiosity, are you using the q4_k_m quantization type?
-
[email protected] replied to [email protected]
I'm just using basic Fabric stuff running through a systemd service for my MC server. It also basically just has every single performance mod I could find and nothing else (as well as Geyser + Floodgate), so there isn't all that much admin stuff to do. I set up RCON (I think it's called) to send commands from my computer, but I just set everything up through SSH. I haven't heard of either Pterodactyl or Crafty Controller; I'll check those out!
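If you end up scripting the RCON side, the mcrcon Python package is one way to do it (the host and password here are placeholders, and RCON has to be enabled in server.properties first):

```python
# Sketch: sending console commands over RCON with the `mcrcon` package
# (pip install mcrcon). Host and password are placeholders; enable-rcon
# and rcon.password must be set in server.properties.
from mcrcon import MCRcon

with MCRcon("127.0.0.1", "my-rcon-password") as mcr:
    print(mcr.command("list"))      # who's online
    print(mcr.command("save-all"))  # flush the world to disk
```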
-
[email protected] replied to [email protected]
Pretty sure TrueNAS SCALE can host everything you want, so you might only need one server. Use Epyc for the PCIe lanes and a Fractal Design Define 7 XL, and you could even escape needing a rackmount if you wanted. Use a PCIe-to-M.2 adapter and you could easily host apps on the NVMe drives in a mirrored pool, and use a special vdev to speed up the HDD storage pool.
The role of the Proxmox server would essentially be filled by apps and/or VMs you could turn on or off as needed.
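A rough sketch of what that pool layout could look like (all pool and device names are made up for illustration; in practice you'd run the zpool commands directly):

```python
# Hypothetical layout: a mirrored NVMe pool for apps/VMs, plus a mirrored
# special vdev to accelerate metadata on the HDD pool. All names are
# placeholders.
import subprocess

# Mirrored NVMe pool for apps and VMs:
subprocess.run(["zpool", "create", "apps", "mirror",
                "/dev/nvme0n1", "/dev/nvme1n1"], check=True)

# Mirrored special vdev (metadata / small blocks) added to the HDD pool:
subprocess.run(["zpool", "add", "tank", "special", "mirror",
                "/dev/nvme2n1", "/dev/nvme3n1"], check=True)
```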
-
[email protected] replied to [email protected]
Also, there are some “former crypto miner” boards that are configured with SUPER wide slots for video cards, exactly like this.
They’re great and cheap used because nobody wants them.
If I have to build a second one, that’s my next path.
-
[email protected] replied to [email protected]
You can still run smaller models on cheaper GPUs; no need for the greatest GPU ever. Btw, I use it for other things too, not only LLMs.