Help with Home Server Architecture and Hardware Selection?
-
[email protected] replied to [email protected]
The HA stuff is only as hard as prepping the cluster and making sure it's replicating fine, then enabling HA on whichever guests you want. It's seriously not difficult at all.
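For anyone new to it, the whole flow is only a few commands. Here's a minimal sketch using Proxmox's stock `pvecm` and `ha-manager` CLIs, wrapped in Python just to annotate it; the cluster name, node IP, and VM ID are all placeholders:

```python
import subprocess

def run(cmd: str) -> None:
    """Run a shell command and raise if it fails."""
    subprocess.run(cmd, shell=True, check=True)

# On the first node: create the cluster (name is a placeholder).
run("pvecm create homelab")

# On each additional node: join using the first node's IP (placeholder).
# run("pvecm add 192.168.1.10")

# Once the cluster is quorate and replication is set up,
# enable HA for a guest (VM 100 here) and ask for it to stay running.
run("ha-manager add vm:100 --state started")
```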
-
[email protected] replied to [email protected]
Note, I say DDR4 for cost reasons; if you're willing and able to spend more upfront for some future-proofing, DDR5 is newer and faster and comes in higher per-stick capacities. But it is considerably more expensive than DDR4 because of that newness.
I clocked my two full 2U servers at 82 dB on a decibel meter app, and their most powerful GPU is a single 1080 for transcoding purposes that isn't even under load at the moment. (1Us and 2Us, especially 1Us, tend to be the louder, screechier variety because of their considerably smaller fans; 3U and 4U servers trend quieter and lower-pitched, closer to a typical desktop fan.)
A TrueNAS box probably wouldn't be too bad; if you're the type who enjoys white noise to sleep, it might even be beneficial.
The Proxmox box will be the one with the 2-4 GPUs, yes? It'll fucking sound like a 747 taking off in your bedroom whenever it's under load.
Also don't forget cooling. A basement is a good option because it's naturally cooler, but you'll still need to ensure good airflow. That's assuming your basement isn't in a hot state/country; if it is, you'll need to look into dedicated active cooling. If you own the place or can otherwise make modifications, a mini-split/heat pump system would do well.
It will generate considerable heat; I posted a meme just the other day about my servers doubling as a heater supplement system. Kind of exaggerating for the meme, but it does have an effect: it raises the temp in my basement office 8-10 degrees under load.
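If you want to ballpark the heating effect, nearly all of a server's electrical draw ends up as heat. A quick back-of-the-envelope conversion (the wattage below is just an example figure, not my measured draw):

```python
# Servers turn essentially all of their electrical draw into heat.
# 1 W ≈ 3.412 BTU/h, the unit mini-split systems are sized in.
draw_watts = 1200                   # example: a couple of loaded servers
btu_per_hour = draw_watts * 3.412
print(f"{btu_per_hour:.0f} BTU/h")  # ~4094 BTU/h, about a third of a small 12k BTU/h mini-split
```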
-
[email protected] replied to [email protected]
For Llama 70B I'm using an RTX A6000; slightly older, but it does the job magnificently with its 48 GB of VRAM.
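For anyone wondering why 48 GB is enough for a 70B model: weights alone need roughly parameter count × bytes per weight, so it only fits once quantized. A rough sketch (this ignores the KV cache and runtime overhead):

```python
def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate VRAM for model weights only (no KV cache or overhead)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

print(weight_vram_gb(70, 16))  # ~140 GB at fp16: no single card fits it
print(weight_vram_gb(70, 4))   # ~35 GB at 4-bit: fits 48 GB with headroom for context
```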
-
[email protected] replied to [email protected]
Their prices lately have been very unimpressive.
-
[email protected] replied to [email protected]
This could also be caused by a bad connection or poor contact between the wire and the receptacle. Notice the side is melted, where the terminal screws would be; that's where the heat would be generated. When you put a load on it and electrons have to jump the gap, it arcs and generates heat. Load is also a factor, on this receptacle or any downstream one, but the melting on the side points to arcing.
-
[email protected] replied to [email protected]
I'm running 70B on two used 3090s and an A6000 NVLink bridge. I think I got the cards for $900 each, and maybe $200 for the NVLink. Also works great.
-
[email protected] replied to [email protected]
Honestly, why not just use an old laptop you have lying around to test one or two of your many projects/ideas and see how it goes, before going $4,000 deep.
-
[email protected] replied to [email protected]
I'm also on p2p 2x3090 with 48GB of VRAM. Honestly it's a nice experience, but still somewhat limiting...
I'm currently running deepseek-r1-distill-llama-70b-awq with the Aphrodite engine, though the same applies for llama-3.3-70b. It works great and is way faster than Ollama, for example. But my max context is around 22k tokens. More VRAM would allow me more context; even more VRAM would allow for speculative decoding, CUDA graphs, ...
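For a sense of where that 22k ceiling comes from: the KV cache grows linearly with context length. A rough estimate, assuming the usual Llama-70B shape (80 layers, 8 GQA KV heads, head dim 128) at fp16; treat those defaults as assumptions:

```python
def kv_cache_gb(ctx_tokens: int, n_layers: int = 80, n_kv_heads: int = 8,
                head_dim: int = 128, bytes_per_value: int = 2) -> float:
    """Approximate fp16 KV-cache size for a GQA model (defaults ~ Llama 70B)."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value  # K and V
    return ctx_tokens * per_token / 1e9

print(kv_cache_gb(22_000))  # ~7.2 GB on top of ~35-40 GB of quantized weights
```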
Maybe I'll drop down to a 35B model to get more context and a bit of speed, but I can't really justify the possible decrease in answer quality.
-
[email protected] replied to [email protected]
ZFS RAID expansion was released a few days ago in OpenZFS 2.3.0: https://www.cyberciti.biz/linux-news/zfs-raidz-expansion-finally-here-in-version-2-3-0/
It might help you decide how much storage you want.
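The expansion itself boils down to a single `zpool attach` onto the existing raidz vdev. A sketch of the call, shelled out from Python; the pool name, vdev name, and device path are all placeholders:

```python
import subprocess

# OpenZFS 2.3.0 raidz expansion: attach one new disk to an existing raidz vdev.
# "tank", "raidz1-0", and "/dev/sdf" are placeholders for your own setup.
subprocess.run(["zpool", "attach", "tank", "raidz1-0", "/dev/sdf"], check=True)
```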
-
[email protected] replied to [email protected]
I'm curious, how do you run the 4x 3090s? The FE cards would take 4x3=12 PCIe slots and need 4x16=64 PCIe lanes... Did you NVLink them? What about transient power spikes? Any clock or even VBIOS mods?
-
[email protected] replied to [email protected]
I have some NVLinks on the way.
Sooooo I've got a friend that used PCIe-to-OCuLink adapters and then back to PCIe to let the cards run outside the case, but that's not what I do; that's just the more common approach.
You can also get PCIe extension cables, but they're pricey.
I stumbled upon a Cubix device by chance, which is a huge and really expensive PCIe bus extender that does some really fancy fucking switching. But I got it at a ridiculous price, and they're hard to come by.
If I do it right, I could host 10 cards total (2 in the machine and 8 in the Cubix).
This also means that I'm running 3x 1600W PSUs, and I'm most at risk of blowing breakers (adding a 240V line is next lol).
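The breaker math, for anyone curious, assuming standard US 120V/15A circuits:

```python
# Why three 1600 W PSUs threaten breakers: on a 120 V / 15 A circuit,
# one loaded PSU already approaches the 80% continuous-load rule (12 A).
psu_watts, circuit_volts = 1600, 120
amps_per_psu = psu_watts / circuit_volts
print(f"{amps_per_psu:.1f} A per PSU")  # ~13.3 A: one PSU nearly fills a 15 A breaker
print(f"{psu_watts / 240:.1f} A")       # ~6.7 A on a 240 V line: half the current per PSU
```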
-
[email protected] replied to [email protected]
I did a double take at that $4000 budget as well! Glad I wasn't the only one.
-
[email protected] replied to [email protected]
As with all things: prepare for it to get worse in every aspect.
-
[email protected] replied to [email protected]
Woah, this is big news!! I'd been following some of the older articles saying this was pending, but had no idea it had just released, thanks for sharing! I'll just need to figure out how much of a datahoarder I'm likely to become, but it might be nice to start with fewer than 6 of the 8TB drives and expand up. (I think 4 drives is the minimum that makes sense; my understanding is also that energy consumption is roughly linear with the number of drives, though that could be very wrong, so maybe I'd even start with 4x 10-12TB drives if I can find them for a reasonable price.) But thanks for flagging this!
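If it helps with the planning, raidz usable capacity is roughly (drives minus parity) × drive size, give or take filesystem overhead. A rough sketch assuming raidz2:

```python
def usable_tb(n_drives: int, drive_tb: float, parity: int = 2) -> float:
    """Rough raidz usable capacity (raidz2 by default), ignoring
    filesystem overhead and the usual TB-vs-TiB gap."""
    return (n_drives - parity) * drive_tb

print(usable_tb(4, 12))  # 4x 12TB raidz2 -> ~24 TB usable
print(usable_tb(6, 8))   # 6x 8TB  raidz2 -> ~32 TB usable
```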
-
[email protected] replied to [email protected]
You are both totally right. I think I anchored high here because of the LLM stuff I'm trying to get running at around a GPT-4 level (which is what I think it will take for folks in my family to actually use it vs. continuing to hand all their data to OpenAI), and it felt tough to get there without spending an arm and a leg on GPUs alone. But my plan now is to start with the NAS build, which I should be able to manage without spending a crazy amount, and then build out iteratively from there. As you say, I'd rather make a $500 mistake than a multiple-thousand-dollar one. Thanks for the sanity check!
-
[email protected] replied to [email protected]
Thank you so much for all of this! I think you're definitely right that starting smaller and trying a few things out is more sensible. For now I'm going to focus on the lower-hanging fruit with the NAS build first and then build up to local AI once I have something stable (but I'll definitely be keeping an eye out for GPU deals in the meantime, so thanks for mentioning the B580 variant; it wasn't on my radar at all as an option).

The thread has definitely given me confidence that splitting things out that way makes sense as a strategy. I had been concerned when I first wrote it out that not planning everything all at once would cause me to miss some major efficiency, but it turns out self-hosting is more like gardening than I thought: it seems to grow organically with one's interest and resources over time. That sounds obvious in retrospect, but I was definitely approaching this more rigidly initially.

And thank you for the HDD rec! I think the Exos are the level above the Ironwolf Pro I mentioned, so I will definitely consider them (especially if they come back online for a reasonable price at serverpartdeals or elsewhere). Just out of curiosity, what are you using for admin on your MC server? I had heard of Pterodactyl previously, but another commenter mentioned CraftyController as a bit easier to work with. Thank you again for writing all of this up, it's super helpful!
-
[email protected] replied to [email protected]
This is definitely good advice. I tend to run my laptops into the ground before I replace them, but a lot of the feedback here has made me think experimenting with something much less expensive first is probably the right move instead of trying to do everything all at once (so that when I inevitably screw up, it at least won't be a $4k screw up.) But thanks for the sanity check!
-
[email protected] replied to [email protected]
Wow, that sounds amazing! I think that GPU alone would probably exceed my budget for the whole build lol. Thanks for sharing!
-
[email protected] replied to [email protected]
This is exactly the sort of tradeoff I was wondering about, thank you so much for mentioning this. I think ultimately I would align with you in prioritizing answer quality over context length (but it sure would be nice to have both!!). My plan for now, based on some of the other comments, is to go ahead with the NAS build and keep my eyes peeled for any GPU deals in the meantime (though honestly I am not holding my breath). Once I've proved to myself I can run something stable without burning the house down, I'll move on to something more powerful for the local LLM. Thanks again for sharing!
-
[email protected] replied to [email protected]
Thanks for sharing! I will probably try to go this route once I get the NAS squared away and turn back to local LLMs. Out of curiosity, are you using the q4_k_m quantization type?
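(For context, q4_k_m is a llama.cpp GGUF quant. If that's the route, loading one looks roughly like this with the llama-cpp-python bindings; the model path and context size below are placeholders:)

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Hypothetical local path to a q4_k_m GGUF quant of a 70B model.
llm = Llama(
    model_path="models/llama-3.3-70b-instruct.Q4_K_M.gguf",
    n_gpu_layers=-1,  # offload every layer to the GPUs
    n_ctx=8192,       # context window, bounded by leftover VRAM
)

out = llm("Explain the q4_k_m quality/VRAM trade-off in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```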