Help with Home Server Architecture and Hardware Selection?
-
[email protected] replied to [email protected]
You could look into building a game streaming server. Moonlight/Sunshine runs decently well, and if you have decent WiFi it will be fine. Theoretically you can divide your GPU into vGPUs, but support for that is hit or miss.
-
[email protected] replied to [email protected]
Oh, they're noisy as hell when they wind up because they're doing a big backup or something. I have them in my laundry room; if you had to listen to them, you'd quickly find something else. In the end, I don't really use much processor power on these, it's more about the memory these boards will hold. RAM was dirt cheap, so having 256GB available for experimenting with kube clusters and multiple Docker hosts is pretty sweet. But considering that you can overprovision both CPU and RAM on Proxmox guests as long as you use your head, you can get away with a lot less. I could probably have gotten by as well or better with a Ryzen with a few cores and plenty of RAM, but these were cheaper.
At times, I've moved all the active guests to one node (I have the PBS server set up as a qdevice for Proxmox to keep a quorum active; it gets pissy if it thinks it's flying solo), and I'll WoL the other one periodically to let the first node replicate to the second, then down it again when it's done. If I'm going to be away for a while, I'll leave both of them running so HA can take over, which has actually happened: the first server packed in a drive, and the failover was so seamless it took me a week to notice. That can save a bit of power, but overall it's a kWh a day per server, which in my area is about 12 cents.
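To put rough numbers on the savings (a quick sketch using the figures above; your rate will vary):

```python
# Rough yearly power cost, using the ~1 kWh/day per server and ~$0.12/kWh
# figures from above; both are local numbers, not universal ones.
kwh_per_day = 1.0
rate_per_kwh = 0.12

for nodes in (1, 2):
    yearly = nodes * kwh_per_day * 365 * rate_per_kwh
    print(f"{nodes} node(s) running 24/7: ~${yearly:.2f}/year")

# -> ~$43.80/year per node, so waking the second node only for replication
#    saves on the order of $40 a year
```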
I've never seen the point of TrueNAS for my use case. I run Nextcloud as a Docker stack using the AIO mastercontainer for myself and 8 users. Together, we use about 1TB of space on it, and that's with a few people having years of photos etc. I mount a separate virtual disk on the Docker host that both Nextcloud and Immich can access, so they can share photos saved in users' NC folders that get backed up from their phones. The AIO also has Collabora Office set up by default, so that might satisfy your document-editing ask there.
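As a rough sketch of that shared-disk layout (the host path and image tags are placeholders, and in reality the AIO mastercontainer manages the Nextcloud side rather than a raw run call like this):

```python
# Two containers bind-mounting the same host directory, so photos synced
# into Nextcloud are visible to Immich without keeping a second copy.
# Requires the Docker SDK for Python: pip install docker
import docker

client = docker.from_env()
shared = {"/mnt/photos": {"bind": "/photos", "mode": "rw"}}  # the shared virtual disk

client.containers.run("nextcloud:latest", name="nextcloud",
                      detach=True, volumes=shared)
client.containers.run("ghcr.io/immich-app/immich-server:release", name="immich",
                      detach=True, volumes=shared)
```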
As I said, I've thought I might get an eGPU and pass it to a Docker guest for AI. I'd prefer my Home Assistant setup not to rely on the Nabu Casa server. I don't mind sending them money, and the STT service that buys me works very well for voice commands around the house, but it rubs me the wrong way to rely on anything on someone else's computers. It's brutally slow when I try to run it even on my desktop Ryzen 7800 without a GPU, though, so until I decide to invest in a good GPU for that stuff, I'll be sending it out. At least I trust them way more than I ever would Google or Amazon; I'd do without if that was the choice.
None of this needs to be a both-feet-first jump; you can just take some old laptop, start to build a Proxmox cluster, and play with this. Your only limit will be the RAM.
I've also seen people build Proxmox clusters using Mac Pro 2013 trashcans; you can get a 12-core Xeon with 64GB of RAM for like $200, and maybe a Thunderbolt enclosure for additional drives. Those would be super quiet and probably low power usage.
-
[email protected] replied to [email protected]
The HA stuff is only as hard as prepping the cluster and making sure it's replicating fine, then enabling HA for whichever guests you want. It's seriously not difficult at all.
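If you'd rather script it than click through the UI, here's a minimal sketch against the Proxmox API using the proxmoxer library (host, credentials, and VM ID are all placeholders):

```python
# Register a guest as an HA resource; Proxmox will then restart or migrate
# it to another node if its current node drops out of the quorum.
# Requires: pip install proxmoxer requests
from proxmoxer import ProxmoxAPI

proxmox = ProxmoxAPI("pve1.example.lan", user="root@pam",
                     password="changeme", verify_ssl=False)

proxmox.cluster.ha.resources.post(sid="vm:100", state="started")
```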
-
[email protected] replied to [email protected]
Note: I say DDR4 for cost reasons. If you're willing and able to spend more upfront for some future-proofing, DDR5 is newer and faster and comes in higher per-stick capacities, but it is considerably more expensive than DDR4 for that newness.
I clocked my 2 full 2U servers at 82 dB on a decibel meter app, and their most powerful GPU is a single 1080 for transcoding purposes that isn't even under load at the moment. (1Us and 2Us, especially 1Us, tend to be the louder, screechier variety because of their considerably smaller fans; 3U and 4U servers trend quieter and lower-pitched, closer to a typical desktop fan.)
A TrueNAS box probably wouldn't be too bad; if you're the type to enjoy white noise to sleep, it might even be beneficial.
The Proxmox box will be the one with the 2-4 GPUs, yes? It'll fucking sound like a 747 taking off in your bedroom whenever it's under load.
Also, don't forget cooling. A basement is a good option because it's naturally cooler, but you'll still need to ensure good airflow, and that's assuming your basement is not in a hot state/country; if it is, you'll need to explore dedicated active cooling. If you own or otherwise can make modifications, a mini-split/heat pump system would do well.
It will generate considerable heat; I posted a meme just the other day about my servers doubling as a heater supplement system. Kind of exaggerating for the meme, but it does have an effect: it raises the temp in my basement office 8-10 degrees under load.
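For a sense of scale (the wattage below is purely illustrative, not a measurement of my boxes):

```python
# A server dissipates essentially all of its electrical draw as heat.
watts = 800  # assumed combined draw under load, purely illustrative
btu_per_hour = watts * 3.412  # 1 W = 3.412 BTU/h

print(f"{watts} W = {btu_per_hour:.0f} BTU/h")
# -> ~2,730 BTU/h, in small-space-heater territory, which lines up with an
#    8-10 degree bump in a basement office under load
```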
-
[email protected] replied to [email protected]
For Llama 70B I'm using an RTX A6000; slightly older, but it does the job magnificently with its 48GB of VRAM.
-
[email protected] replied to [email protected]
Their prices lately have been very unimpressive.
-
[email protected] replied to [email protected]
This could also be caused by a bad connection or poor contact between the wire and the receptacle. Notice the side is melted, where the terminal screws would be; that's where the heat would be generated. When you put a load on it and electrons have to jump the gap, it arcs and generates heat. Load is also a factor, on this receptacle or any downstream, but the melting on the side was likely caused by arcing.
-
[email protected] replied to [email protected]
I'm running 70B on two used 3090s and an A6000 NVLink bridge. I think I got the cards for $900 each, and maybe $200 for the NVLink. Also works great.
-
[email protected] replied to [email protected]
Honestly, why not just use an old laptop you have laying around to test one or two of your many projects/ideas and see how it goes, before going $4,000 deep.
-
[email protected] replied to [email protected]
I'm also on 2x3090 with P2P and 48GB of VRAM. Honestly it's a nice experience, but still somewhat limiting...
I'm currently running deepseek-r1-distill-llama-70b-awq with the Aphrodite engine, though the same applies for llama-3.3-70b. It works great and is way faster than Ollama, for example. But my max context is around 22k tokens. More VRAM would allow me more context; even more VRAM would allow for speculative decoding, CUDA graphs, ...
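To put numbers on why it tops out around there, a back-of-the-envelope sketch (the layer/head counts are from the public Llama-3 70B config; the fp16 KV cache and 4-bit weights are my assumptions):

```python
# Rough VRAM budget for a 70B model with GQA (80 layers, 8 KV heads,
# head_dim 128), ignoring activations and engine overhead.
layers, kv_heads, head_dim = 80, 8, 128
kv_dtype_bytes = 2  # fp16 KV cache

kv_per_token = 2 * layers * kv_heads * head_dim * kv_dtype_bytes  # K and V
weights_gib = 70e9 * 0.5 / 1024**3  # ~4-bit AWQ weights

context = 22_000
kv_gib = context * kv_per_token / 1024**3
print(f"weights ~{weights_gib:.0f} GiB + KV ~{kv_gib:.1f} GiB at {context} tokens")
# -> ~33 GiB + ~6.7 GiB, which is why 48GB of VRAM runs out near 22k context
#    once activations and engine overhead are added on top
```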
Maybe I'll drop down to a 35B model to get more context and a bit of speed, but I can't really justify the possible decrease in answer quality.
-
[email protected] replied to [email protected]
ZFS RAIDZ expansion was released a few days ago in OpenZFS 2.3.0: https://www.cyberciti.biz/linux-news/zfs-raidz-expansion-finally-here-in-version-2-3-0/
It might help you with deciding how much storage you want.
-
[email protected] replied to [email protected]
I'm curious, how do you run the 4x3090s? The FE cards would be 4x3=12 PCIe slots and 4x16=64 PCIe lanes... Did you NVLink them? What about transient power spikes? Any clock or even VBIOS mods?
-
[email protected] replied to [email protected]
I have some NVLinks on the way.
Sooooo, I've got a friend that used PCIe-to-OCuLink adapters and then back to PCIe to allow the cards to run outside the case, but that's not what I do; that's just the more common approach.
You can also get pcie extension cables, but they’re pricey.
I stumbled upon a Cubix device by chance, which is a huge and really expensive PCIe bus extender that does some really fancy fucking switching. But I got it at a ridiculous price, and they're hard to come by.
If I do it right, I could host 10 cards total (2 in the machine and 8 in the Cubix).
This also means that I'm running 3x 1600W PSUs, and I'm most at risk of blowing breakers (adding in a 240V line is next, lol).
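The breaker math is easy to sketch (the 120V/15A circuit below is an assumed typical North American branch circuit, not my actual panel):

```python
# Worst-case draw of 3x 1600 W PSUs against a single branch circuit.
psus, psu_watts = 3, 1600
volts, breaker_amps = 120, 15

total_amps = psus * psu_watts / volts
print(f"{total_amps:.0f} A possible vs a {breaker_amps} A breaker")
# -> 40 A; even one PSU near its limit pulls ~13.3 A, most of a 15 A circuit.
#    The same wattage on a 240 V line draws half the current.
```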
-
[email protected] replied to [email protected]
I did a double take at that $4000 budget as well! Glad I wasn't the only one.
-
[email protected] replied to [email protected]
As with all things: prepare for it to get worse in every aspect.
-
[email protected] replied to [email protected]
Woah, this is big news!! I'd been following some of the older articles talking about this being pending, but had no idea it just released; thanks for sharing! I'll just need to figure out how much of a data hoarder I'm likely to become, but it might be nice to start with fewer than 6 of the 8TB drives and expand up. (I think 4 drives is the minimum that makes sense. My understanding is also that energy consumption is roughly linear with the number of drives, though that could be very wrong, so maybe I'd even start with 4x 10-12TB drives if I can find them for a reasonable price.) But thanks for flagging this!
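Rough capacity math for that start-small-and-expand plan (assuming raidz2; the layout choice is mine, not from the article):

```python
# Usable space as a RAIDZ vdev grows; parity drives are subtracted outright,
# which slightly overstates real-world usable space (metadata, padding).
drive_tb = 8
parity = 2  # raidz2

for drives in (4, 6, 8):
    usable = (drives - parity) * drive_tb
    print(f"{drives} x {drive_tb} TB raidz{parity}: ~{usable} TB usable")
# -> ~16 / 32 / 48 TB. One caveat with the new expansion: data written before
#    an expand keeps its old data-to-parity ratio until it is rewritten.
```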
-
[email protected] replied to [email protected]
You are both totally right. I think I anchored high here just because of the LLM stuff I'm trying to get running at around a GPT-4 level (which is what I think it will take for folks in my family to actually use it vs. continuing to pass all their data to OpenAI), and it felt tough to get there without spending an arm and a leg on GPUs alone. But my plan now is to start with the NAS build, which I should be able to accomplish without spending a crazy amount, and then build out iteratively from there. As you say, I'd prefer to screw up and make a $500 mistake vs. a multiple-thousand-dollar one. Thanks for the sanity check!
-
[email protected] replied to [email protected]
Thank you so much for all of this! I think you're definitely right that starting smaller and trying a few things out is more sensible. At least for now I'm going to focus on the lower-hanging fruit by doing the NAS build first and then build up to local AI once I have something stable (but I'll definitely be keeping an eye out for GPU deals in the meantime, so thanks for mentioning the B580 variant; it wasn't on my radar at all as an option).

The thread has definitely given me confidence that splitting things out that way makes sense as a strategy. When I first wrote it out, I was concerned that not planning everything all at once would cause me to miss some major efficiency, but it turns out that self-hosting is more like gardening than I thought, in that it seems to grow organically with one's interest and resources over time. Sounds obvious in retrospect, but I was definitely approaching this more rigidly initially.

And thank you for the HDD rec! I think the Exos are the level above the IronWolf Pro I mentioned, so I'll definitely consider them (especially if they come back online for a reasonable price at serverpartdeals or elsewhere). Just out of curiosity, what are you using for admin on your MC server? I had heard of Pterodactyl previously, but another commenter mentioned Crafty Controller as a bit easier to work with. Thank you again for writing all of this up; it's super helpful!
-
[email protected] replied to [email protected]
This is definitely good advice. I tend to run my laptops into the ground before I replace them, but a lot of the feedback here has made me think experimenting with something much less expensive first is probably the right move, instead of trying to do everything all at once (so that when I inevitably screw up, it at least won't be a $4k screw-up). But thanks for the sanity check!
-
[email protected] replied to [email protected]
Wow, that sounds amazing! I think that GPU alone would probably exceed my budget for the whole build lol. Thanks for sharing!