Hacking Together A New Home Server
2020-08-25 23:38 - General
Background
Some time last year the server I keep in my Mom's basement (for my remote backups and a few small things for her) went kaput. I don't know how or why because it's my remote server and I don't have easy access to fix it. It was almost exactly a decade old, and not very powerful when it was new (to me, refurbished) and it just wasn't booting at all. It was time to replace it. But since it's primarily a backup server, time was not of the essence. I took the time to figure out how to rebuild it around a Supermicro server-class motherboard, with built in IPMI. A technology that allows completely hands-off remote management of every aspect of the server. Great for a remote backup machine! I slowly bought parts, assembled them all at home and then once it was working well, brought it out to my Mom's at my next visit.
Point being: it introduced me to building a server with real server-class hardware, with built in remote management. I got a taste for it.
Separately I've told the story of how I broke and rebuilt the main server I keep at home, the primary source of the backups to be sent remotely. More recently I figured out that the first-generation AMD Ryzen chip that was in it was not so hot. So much so that AMD accepted an RMA even a couple of years later (it's a known design defect) and replaced it with a more powerful chip. Which seemed nice at first. Turns out, however, it's not. In addition to the problem that was bad enough (not really, the workaround was easy and the occurrence was predictable and rare) to warrant a return and replacement for free, there's a worse problem with these early Ryzen chips: random freezes and reboots. The popular internet theories tend to revolve around bad power management or bad early motherboard support. Or both? This problem happens, ironically enough, at idle as opposed to at high load (as the bad-enough-for-return problem did). Which usually means that after a few hours of sitting around doing almost nothing (e.g. overnight!) the machine will freeze, crash, or reboot.
This wasn't obvious at first, but within a couple of weeks it became a real problem: every few days I'd find my server hadn't really been running for a while. I did internet research and tried to figure it out. I updated the BIOS, which actually removed the primary option that used to help. At one point crashes every few hours were common. I tried buying a brand new motherboard, with a newer chipset and theoretically better support. It was worse: it wouldn't even stay running long enough to boot regularly, much less last for a few hours after that. (So I returned it, which I generally hate to do.)
Long story slightly shorter: I could get it stable, but only by completely turning off power management ("Cool'n'Quiet") in the BIOS and running a busy-loop job just to keep the CPU slightly above idle. (Again, internet theories: it's something about waking up from a completely idle low-power state that fails, so keeping the CPU busy prevents it from going into that low-power state.) Just not good enough. And that nice "real server" I had built not too long ago was nagging at the back of my mind...
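For illustration, a keep-busy job along those lines might look like the minimal Python sketch below. It is not the script that actually ran on this server; the function name, the parameters, and their defaults are all hypothetical, and whether a given duty cycle is enough to keep a CPU out of its deepest idle state depends on the hardware.

```python
import time


def keep_busy(duty_cycle=0.05, period=0.1, run_for=None):
    """Spin for a fraction of each period so the CPU is never idle for long.

    duty_cycle: fraction of each period spent busy-waiting (hypothetical default).
    period:     length of one busy/sleep cycle in seconds (hypothetical default).
    run_for:    stop after this many seconds; None means run until interrupted.
    Returns the number of completed periods.
    """
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in (0, 1]")
    deadline = None if run_for is None else time.monotonic() + run_for
    periods = 0
    while deadline is None or time.monotonic() < deadline:
        spin_until = time.monotonic() + period * duty_cycle
        while time.monotonic() < spin_until:
            pass  # busy-wait: this is the part that keeps a core awake
        time.sleep(period * (1 - duty_cycle))  # give the rest of the period back
        periods += 1
    return periods
```

Calling `keep_busy()` with no arguments loops until the process is killed, which is how a job like this would normally be left running; passing `run_for` is only there to make it easy to try out.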
Story
So I had a server that at least didn't crash regularly, but I was no longer happy with it. I decided it was time to build a second "real server". I took more time and put in more planning though, this round. I was mostly set on Supermicro again. First: using the same platform would make things easier in the long haul, with the same tools and features on both servers. Second: they're pretty common. There's enough used stuff out there to be nice and affordable, too. I looked at a wide range of boards, spanning the X9 (like the one I built last year) to the X11 (newest, as far as I can tell) model lines. The newer models never seemed to hit a better price/performance trade-off so I started looking more closely at the X9 boards. They weren't, as I had first suspected, all built around the same class of processors; they varied. Within that bunch, the LGA2011 type socket chips seemed the best, and I made a spreadsheet with a chart:
Which shows the CPU Mark rating for all the LGA2011 processors (that seemed commonly available) compatible with the X9 boards I could find charted against the price (lowest stable/believable price on eBay). There's a few excursions above and below the trend line, but not really far.
What I had to decide was how much performance do I really want? I spent around two years with a Ryzen 5 1600, CPU Mark 12455 (plenty for my needs) and a few months with a Ryzen 7 1700, CPU Mark 14566 (even more!). Very few available processors were in that range. But there are dual processor boards available!
I began trolling eBay for a good deal on either a single powerful chip or a dual processor board that could house two more inexpensive chips. Around this time I learned that there's the extremely standard ATX motherboard size, a larger E-ATX size, and a yet-larger EE-ATX size. The latter is extremely non-standard. I was looking at the X9DRi-LNF4+ board for a while, but it's that EE-ATX size which is really Supermicro-specific and only works (well) in Supermicro's own (expensive) cases. Next I found the very similar X9DRi-F: same line, slightly less feature packed, and just "E"-ATX. I looked around for a while and found a real auction (not buy-it-now) item on eBay, and I won it! First part selected!
What followed was building the machine around that: picking the CPUs to go with it: two Intel Xeon E5-2650 v2s at 10006 CPU Mark each, 64 GB of ECC memory (4 16GB sticks), a brand new power supply (with two CPU power connectors) and other miscellaneous parts.
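As a rough sanity check on that choice, the CPU Mark figures quoted in this post can be compared directly. The sketch below assumes a dual-socket system scores about twice a single chip, which is an optimistic simplification (real scaling may be lower), so read the ratios as an upper bound rather than a measurement.

```python
# CPU Mark figures as quoted in the post.
RYZEN_5_1600 = 12455
RYZEN_7_1700 = 14566
XEON_E5_2650_V2 = 10006  # per chip

# Assumption: two sockets score roughly twice one chip (an upper bound).
dual_xeon = 2 * XEON_E5_2650_V2

for name, score in [("Ryzen 5 1600", RYZEN_5_1600), ("Ryzen 7 1700", RYZEN_7_1700)]:
    print(f"Dual Xeon vs {name}: {dual_xeon / score:.2f}x")
# Dual Xeon vs Ryzen 5 1600: 1.61x
# Dual Xeon vs Ryzen 7 1700: 1.37x
```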
Aside: part of the reason I tended towards the dual processor board was the memory. In last year's build I accidentally bought "registered" ECC memory, or RDIMMs. Support for registered memory is mostly limited to servers and a few other "high end" computers. It's not in as much demand, so it can end up cheaper. (That's how I picked it at first: it said ECC and everything else seemed OK and it was the cheapest.) This time I already had the motherboard and I knew it supported registered memory. (I think all dual processor boards do. They tend to have more RAM slots, and the benefit of registered memory is that it puts less electrical load on the memory controller, so it works better when there are more individual sticks of memory sharing it all.) Either way this time it didn't matter, the prices were all very close, registered or not. Oh well.
Plus a case to put it all in. With hindsight I can now say that it turns out E-ATX isn't a standard, either. At least not to the degree that ATX is. If you buy a motherboard and a case that both say they work with ATX, they're virtually guaranteed to work together. Not so with E-ATX. Not only did I pick this unusual motherboard size, I also have several (redundant) disks that I want to put in the case. And despite picking server class hardware, I want a PC-like tower. All the real (rack mount) server cases I've ever seen have loud fans in them, and this is going to be in my studio apartment, where I sleep.
I looked around for a while and selected the Antec P101 Silent. Looks great. Eight side-load disk trays: enough to hold all my disks, and they make for easy install and replacement. Motherboard compatibility list says "E-ATX". Great. I ordered it. I got it. As soon as I saw it, it became clear that in order to fit an E-ATX motherboard, you need to remove six of the eight drive trays. Not great.
I had patiently and slowly gathered all my parts, but now they didn't all work (together). The seller didn't reveal this fact and neither did the manufacturer, in any part of their pages or the manual. I tried to return it and was initially quoted a $47 return shipping fee, for a $110 or so item. My heart sank. After some thinking, I got the tiniest bit creative with how I filled out the return request form and got it to agree for the seller to pay return shipping. Along the way I selected "store credit" instead of refund. I don't know if that helped, but I felt like it did.
Then UPS and Newegg conspired to hold my refund hostage for over four weeks (July 16 to August 12) before finally accepting and processing the return. By then the replacement model I selected had gone up in price by over $25. Sigh. But I'd been waiting a long time and had already locked up over $100 in store credit so I just went with it. I got a Fractal Design Define XL R2; in the meantime I had reached out to confirm that it really would fit my motherboard. (Mostly! See below.)
The Build
The last detail is that by now I had learned how not-quite-standard E-ATX was. I had found other hobbyists like me, building machines around similar Supermicro boards, and wanting to use normal PC cases. Some but not all of the mounting holes line up, and in one of the discussions I read somebody called out how important it is, in a vertical tower case, to support the weight of the CPU coolers well. So all this time waiting, I had done nothing, because step one was to custom mount the motherboard into the case, and I needed to wait for the case.
The board has ten mounting holes. Six of them lined up with factory holes. Four I drilled out myself: install the bare board with the six good mounting points, mark where the remaining four holes should go, and then simply drill a hole just big enough to allow the standoff's threads to fit through. Screw the standoff into a standalone nut, rather than the threads built into the case at the factory. Worked great, I just had to spend some quality time with a file to get two of the four holes to line up with the rest. Look closely and you can see the four shiny metal nuts in the picture above, on the black background of the case. Look even more closely and you can see five of the much smaller shiny points where the standoffs fit through the factory holes. (The sixth is obscured by the cabling.)
With the motherboard now securely installed (the screw heads are black and almost disappear in the photo), next was to install the CPUs. Another tiny saga:
The processors I bought were these, as pictured. I thought that blue thing was some sort of special shipping/protective case. So I biased towards listings including them. It's not. Everything's fine now, but I had to wince as I yanked the blue plastic piece off, it was glued on. It's for some (maybe vendor-specific, like for HP?) slightly more specific variant of the CPU socket, that I don't have. Ok. CPUs installed. Good. Next is the CPU coolers.
They didn't fit.
I got a pair of CPU coolers made to fit several different CPU sockets; you're meant to pick and choose the right parts (among several available) to fit the socket you're using. There are some pieces labeled either "Intel" or "AMD". I've got an Intel CPU so the choice seemed obvious, but it was way too big. And the AMD one had two pairs of holes, either a little too small or just a little smaller than that. I found the data sheet, which clearly calls out a mounting hole spacing of 80 mm square. Mine aren't square. Turns out there's more than one LGA2011 socket, and I've got the Narrow ILM ("Independent Loading Mechanism"). Which I learned from a page using a very similar board as an example: a smaller form factor socket so that (e.g.) two can fit on a single board!
I can solve this. The AMD piece was almost the right size. So after a little time with a drill and a Dremel, I had added my own extra holes. Each piece ends up tilted just a bit, but by keeping the offset symmetric everything still lines up well enough to work. Phew!
So I'm done, right? Plug in the memory and turn it on! And it doesn't boot. It does display a code on the screen, which indicates bad memory. After extensive testing I found that one of the four sticks of RAM I bought is no good, at least enough to prevent booting the machine: each of the other three works fine as the only installed memory. The fourth: no go. (Still waiting for that replacement; the seller has been very unhelpful, but eBay claims a replacement is on the way. Return shipping is ... unclear.)
Done
Mostly. Here's the new and old machines next to each other. It's hard to capture in one picture, but (to fit the bigger motherboard) the new one is a few inches taller and deeper (and happens to be subtly wider, too).
I'm waiting for the last of the memory to arrive (either way it's a big upgrade over what I had before). And there's an additional "Storage Controller Unit" with four more SATA connections that I'd like to use but haven't figured out yet. But otherwise it's set up and working! The story is long enough even without the software-based issues I had switching from an AMD to an Intel processor. But it's basically done!
Cost
As mentioned, this was mostly assembled from used parts. I spent (including tax and shipping in-line):
Item | Condition | Amount |
---|---|---|
Fractal Design Define XL R2 (case) | New | $189.98 |
Supermicro X9DRi-F motherboard | Used | $160.59 |
EVGA SuperNOVA 750 GA 220-GA-0750-X1 power supply | New | $151.40 |
2x Intel Xeon E5-2650 v2 CPUs | Used | $114.13 |
4x 16GB ECC DDR3 DIMM | Used | $100.17 |
2x RAIJINTEK AIDOS CPU Cooler | New | $37.00 |
And I transplanted my existing disks in. Total: $753.27. I'm happy with this; buying a standalone remote management tool is possible, but for some reason they seem to start in the $400 range, used. I got that plus a nicer server upgraded in just about every way.
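The total can be double-checked from the table with a few lines of Python. This is only a sketch: the rows are copied from the table above, `subtotal` is a helper name made up for this example, and `Decimal` is used so the cents add up exactly.

```python
from decimal import Decimal

# (item, condition, amount) rows copied from the cost table.
PARTS = [
    ("Fractal Design Define XL R2 (case)", "New", Decimal("189.98")),
    ("Supermicro X9DRi-F motherboard", "Used", Decimal("160.59")),
    ("EVGA SuperNOVA 750 GA power supply", "New", Decimal("151.40")),
    ("2x Intel Xeon E5-2650 v2 CPUs", "Used", Decimal("114.13")),
    ("4x 16GB ECC DDR3 DIMM", "Used", Decimal("100.17")),
    ("2x RAIJINTEK AIDOS CPU Cooler", "New", Decimal("37.00")),
]


def subtotal(condition=None):
    """Sum the amounts, optionally restricted to one condition ("New" or "Used")."""
    return sum(
        (amount for _, cond, amount in PARTS if condition in (None, cond)),
        Decimal("0"),
    )


total = subtotal()
print(f"Total: ${total}")            # Total: $753.27
print(f"Used:  ${subtotal('Used')}")  # Used:  $374.89
print(f"New:   ${subtotal('New')}")   # New:   $378.38
```

The split is close to even: the used motherboard, CPUs, and memory together cost about as much as the new case, power supply, and coolers.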