File Server Builder's Guide
by Zach Throckmorton on September 4, 2011 3:30 PM EST

What is a file server?
Essentially, a file server is a computer that stores files, is attached to a network, and provides shared access to those files for multiple workstation computers. File servers do not perform computational tasks - that is, they do not run programs for client machines. Nor do they serve dynamic content the way a web server does, or provide access to a shared database the way a database server does. File servers simply provide access to static files, over a local network through Windows or Unix file-sharing protocols (SMB/CIFS and NFS), and over the internet through protocols such as FTP and HTTP.
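As a rough illustration of those protocols in practice (a minimal sketch, not part of any specific build in this guide - the directory, share name, user, and subnet below are placeholders), exporting one folder from a Linux-based file server over both SMB (Windows file sharing) and NFS (Unix file sharing) can look like this:

# SMB via Samba: append a share definition to smb.conf and create a Samba user
cat >> /etc/samba/smb.conf <<'EOF'
[media]
   path = /srv/media
   read only = no
   valid users = alice
EOF
smbpasswd -a alice        # set a Samba password for an existing system user "alice"
systemctl restart smbd    # the service name varies by distribution

# NFS: export the same directory read/write to the local subnet
echo '/srv/media 192.168.1.0/24(rw,sync)' >> /etc/exports
exportfs -ra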
What can you do with a file server?
The primary function of a file server is storage. For the home user, one central storage location can increase overall computing efficiency and reduce overall computing cost. By placing all of your important files in a single location, you do not need to worry about different versions of files you're actively working on, wasting disk space by having multiple copies of less-than-important files scattered on different systems, backing up the right files onto the right backup storage medium from the right computer, making sure every PC in your home has access to the appropriate files, and so on.
From a system builder's perspective, a file server can also liberate your various workstation computers from having to accommodate multiple hard drives, and decrease overall hard drive expenditures. With the rise of SSDs, which offer tremendous performance at a high cost per GB, a file server can free workstations from the performance shackles of platter-based disks - an especially useful consideration for laptops and netbooks, which usually house only one drive, so an affordable SSD's limited capacity would otherwise be a deal breaker.
A dedicated file server allows every user in a home - whether they're at home or on the road - to access every file they might need, regardless of which particular device they might be using at any given time. Dedicated file servers also allow you to share your files with friends and coworkers - simply provide them with a URL, a login name and password, and specify what content they can access. For example, maybe you'd like to share your kids' camp photos with the in-laws - but your cloud storage capacity won't fit all of those photos plus all of the other stuff you have stored in your cloud drive locker. Maybe you'd like to share sensitive information with a colleague that you'd rather not upload to a server owned by Amazon or some other third party, but the files are too big to email. Or maybe you'd simply like to access your 200GB library of MP3s while you're holed up in a hotel on business with nothing but your 60GB SSD-based netbook. These few examples are really only the tip of the iceberg when it comes to the utility of a file server.
That said, there are alternatives to a file server for all of these needs. You could dump all of your photos onto a flash drive and give them to the in-laws the next time you see them - but you have to do this every time you want to share more photos - and who knows if you'll get your flash drives back? You could mail a DVD-R to your colleague - but perhaps a DVD-R's ~4GB capacity is insufficient, and snail mail takes days if not weeks to be delivered. If you're on the road, you could just bring along your portable external hard drive - which takes up space, and can be lost or stolen. A file server is a simple, singular solution to all of these problems. Home file servers do not require enterprise-grade hardware and can be very affordable. They can also be made from power-sipping components that won't spike your electrical bill.
What considerations are important in building a file server?
Because the primary role of a file server is storage, this is the most important aspect to think about. How much storage space do you need? Do you want to share 50GB of photos taken on a point and shoot digital camera? 500GB of music? 2TB of movie DVD ISOs? 30TB of mixed media and work-related files? Also, at what rate are your storage demands growing, and how easily do you want to be able to expand your file server?
How easily do you want to be able to administer your files? Many of the more powerful file server operating systems are unfortunately not particularly easy for the non-IT professional to run, though some are far more approachable. What about being able to recover your files in the event of catastrophe? Placing all of your files on one computer is tantamount to putting all of your eggs in one basket, which can be risky. What about security? Anything on any sort of network is vulnerable to intrusion. While this guide addresses all of these questions, it is aimed at home users and therefore necessarily makes some sacrifices in storage space, administration capabilities, recoverability, and security - simply because home users typically can neither afford nor require professional-grade file server solutions.
Why build a file server instead of using NAS?
Simply put, a NAS (network attached storage) device is a computer appliance. It is built specifically to provide network-accessible storage. NAS devices typically offer easier administration than file servers (some are a few mouse clicks away from plug and play operability), but are often limited by proprietary software, and are neither as capacious nor as expandable as a dedicated file server. Further, higher-end NAS devices that can house as many hard drives as some of the builds outlined in this guide are more expensive than the file server alternative. Finally, because they are designed with only one purpose in mind, they are not as flexible as a file server, which, in a multi-system home, might need to be co-opted into a basic workstation at a later point in time. That said, while NAS devices are outside the scope of this guide, they're worth investigating if you're not already familiar with them.
This guide is laid out differently than my previous builder's guides in that rather than detailing specific systems at specific price points capable of performing specific tasks, it instead discusses options for operating systems and types of components and how these different options are best suited to addressing different needs. That is, maybe you need a lot of storage space but you're not particularly concerned about backups. Or perhaps you don't need much storage space at all but want to use a very straightforward file server operating system. By mixing and matching recommendations to suit your needs, hopefully you'll be able to construct a file server with which you'll be pleased!
152 Comments
HMTK - Monday, September 5, 2011 - link
Inferior as in PITA for rebuilds and stuff like that. On my little Proliant Microserver I use the onboard RAID because I'm too cheap to buy something decent and it's only my backup machine (and domain controller, DHCP, DNS server) but for lots of really important data I'd look for a true RAID card with an XOR processor and some kind of battery protection: on the card or a UPS.

fackamato - Tuesday, September 6, 2011 - link
I've used Linux MD software RAID for 2 years now, running 7x 2TB 5400 rpm "green" drives, and never had an issue (except one Samsung drive which died after 6 months). This is on an Atom system. It took roughly 24h to rebuild to the new drive (CPU limited of course), while the server was happily playing videos in XBMC.
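For anyone who hasn't done it, swapping a replacement disk into a Linux MD array and watching the rebuild is only a few commands. A sketch with placeholder names, assuming the array is /dev/md0, the dead member is /dev/sdb1, and the replacement partition is /dev/sdg1:

mdadm --manage /dev/md0 --remove /dev/sdb1        # remove the failed member
mdadm --manage /dev/md0 --add /dev/sdg1           # add the replacement; the rebuild starts automatically
cat /proc/mdstat                                  # shows rebuild progress and an ETA
echo 50000 > /proc/sys/dev/raid/speed_limit_min   # optionally raise the rebuild speed floor (KB/s)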
Sivar - Tuesday, September 6, 2011 - link
This is not true in my experience. Hardware RAID cards are far, far more trouble than software RAID when using non-enterprise drives.
The reason:
Nearly all hard drives have read errors, sometimes frequently.
This usually isn't a big deal: The hard drive will just re-read the same area of the drive over and over until it gets the data it needs, then probably mark the trouble spot as bad, remapping it to spare area.
The problem is that consumer hard drives are happy to spend a LONG time rereading the trouble spot - far longer than it takes most hardware RAID cards to decide the drive is not responding and drop it from the array, even though it's a perfectly good drive.
For "enterprise" SATA drives, often the *only* difference, besides price, is that enterprise drives have a firmware flag set to limit their error recovery time, preventing them from dropping unless they have a real problem. Look up "TLER" for more information.
Hardware RAID cards generally assume they are using enterprise drives. With software RAID it varies, but in Linux and Windows Server 2008 R2 at least, I've never had a good drive drop. This isn't to say it can't happen, of course.
------------------------------
For what it's worth, I recommend Samsung drives for home file servers. The 2TB Samsung F4 has been excellent. Sadly, Samsung is selling its HDD business.
I expressly do not recommend the Western Digital GP (Green) series, unless you can find older models from before TLER was expressly disabled in the firmware (even as an option).
Havor - Sunday, September 4, 2011 - link
HighPoint RocketRAID 2680 SGL PCI-Express x4 SATA / SAS (Serial Attached SCSI) Controller Card
In stock.
Now: $99.00
http://www.newegg.com/Product/Product.aspx?Item=N8...
Screw software RAID - and then there are many cards with more options, like online array expansion.
Ratman6161 - Tuesday, September 6, 2011 - link
For home use, a lot/most people are probably not going to build a file server out of all new components. We are mostly recycling old stuff. My file server is typically whatever my old desktop system was. So when I built my new i7-2600K system, my old Core 2 Quad 6600 desktop became my new server. But... the old P35 motherboard in this system doesn't have RAID and has only 4 SATA ports. It does have an old IDE port, so it got my old IDE CD-ROM and three hard drives that were whatever I had laying around. Had I wanted RAID, though, I would probably get a card.

Also, as to the OS: a lot of people using a machine as a home file server are not going to need ANY "server" OS. If you just need to share files between a couple of people, any OS you might run on that machine is going to give you the ability to do that. Another consideration is that a lot of services and utilities have special "server" versions that will cost you more. Example: I use Mozy for cloud backup, but if I tried to do that on a Windows Server, it would detect that it was a server and want me to upgrade to the Mozy Pro product, which costs more. So by running the "server" on an old copy of Windows XP, I get around that issue. Unless you really need the functionality for something, I'd steer clear of an actual "server" OS.
alpha754293 - Tuesday, September 6, 2011 - link
@Rick83: "MY RAID card recommendation is a mainboard with as many SATA ports as possible, and screw the RAID card."
I think that's somewhat of a gross overstatement. And here's why:
It depends on what you're going to be building your file server for, how much data you anticipate putting on it, and how important that data is. Like, would it be a big deal if you lost all of it? Some of it? A week's worth? A day's worth? (i.e. how fault tolerant ARE you?)
For most home users, that's likely going to be like pictures, music, and videos. With 3 TB drives at about $120 a pop (upwards of $170 a pop), do you really NEED a dedicated file server? You can probably just set up an older, low-powered machine with a Windows share and that's about it.
@Rick83/PCTC2
I think that when you're talking about rebuild rates, it depends on what RAID level you were running. Right now, I've got a 27 TB RAID5 server (30 TB raw, 10 * 3TB, 7200 rpm Hitachi SATA-3 on Areca ARC-1230 12-port SATA-II PCIe x8 RAID HBA); and it was going to take 24 hours using 80% background initialization or 10 hours with foreground initialization. So I would imagine that if I had to rebuild the entire 27 TB array; it's going to take a while.
re: SW vs. HW RAID
I've had experience with both. First was onboard SAS RAID (LSI 1068E), then ZFS on 16*500 GB Hitachi 7200 rpm SATA drives on an Adaptec 21610 (16-port SATA RAID HBA), and now my new system. Each has its merits.
SW RAID - pros:
It's cheap. It's usually relatively easy to set up. They work reasonably well (most people probably won't be able to practically tell the difference in performance). It's cheap.
SW RAID - cons:
As I've experienced, twice: if you don't have backups, you can be royally screwed. Unless you've actually TRIED transplanting a SW RAID array, it SOUNDS easy, but it almost never is. A lot of the time there are a LOT of things happening/running in the background that are transparent to the end user, so if you try to transplant the array, it doesn't always work. And if you've ever tried transplanting a Windows install (even without RAID), you'll know that.
There's like the target, the LUN, and a bunch of other things that tell the system about the SW RAID array.
It's the same with ZFS. In fact, ZFS is maybe a little bit worse, because I think each hard drive gets something like a 56-character tag as a unique ID. If you pulled a drive out of one of the slots and swapped it with another, haha... watch ZFS FREAK out. Kernel panics were so "rampant" that there was a page telling you how to clear the ZFS pool cache to stop the endless kernel panic (white screen of death) loop. And then once you're back up and running, you had to remount the ZFS pool, scrub it to make sure there were no errors, and then you're back in business.
Even Sun's own premium support says that in the event of a catastrophic failure with SW RAID, restore your data from back-ups. And if that server WAS your backup server -- well...you're SOL'd. (Had that happen to me TWICE because I didn't export and down the drives before pulling them out.)
So that's that. (Try transplanting a Windows SW RAID....haha...I dare you.) And if you transplanted a single Windows install enough times, eventually you'll fully corrupt the OS. It REALLLY hates it when you do that.
HW RAID - pros:
Usually it's a lot more resilient. A lot of them have memory caches, and some even have backup battery modules that store pending write operations in the event of a power failure so that at the next power-up the card can complete the replay* (*where/when supported). That prevents data corruption in the event that, say, you're in the middle of copying something onto the server when the power dies. It's more important for automated write operations, but since most people slowly pick and choose what they put on the server anyway, it's usually not too bad - you might remember where it left off and pick it up from there manually.
It's usually REALLY REALLY fast because it doesn't have OS overhead.
ZFS was a bit of an exception because it waits until a buffer of operations is full before it actually commits them to disk. So you can get a bunch of 175 MB/s bursts (onto a single 2.5" Fujitsu 73 GB 10k rpm SAS drive), but your clients might still be reporting 40 MB/s. On newer processors it was effectively idle; on an old Duron 1800, it would register 14% CPU load doing the same thing.
HW RAID - cons:
Cost. Yes, the controllers are expensive. But you can also get some older systems/boards with onboard HW RAID (like LSI-based controllers), and they work.
With a PCIe x8 RAID HBA, even in a PCIe 1.0 slot, each lane is 2 Gbps (250 MB/s) in each direction. So an 8-lane PCIe 1.0 card can do 16 Gbps (2 GB/s) each way, or 32 Gbps (4 GB/s) bidirectional. SATA-3 is only good to 6 Gbps (750 MB/s including overhead). The highest I'm hitting with my new 27 TB server is just shy of the 800 MB/s mark; sustained read is 381 MB/s (limited by the SATA-II connector interface). It's the fastest you can get without PCIe SSD cards. (And as far as I know, you CAN'T RAID the PCIe SSD cards. Not yet anyway.)
Brutalizer - Friday, September 9, 2011 - link
It doesn't sound like I have the same experience with ZFS as you. For instance, your hw-raid ARECA card - is it in JBOD mode? You know that hw-raid cards screw up ZFS seriously?
I have pulled disks and replaced them without problems - you claim you had problems? I have never heard of such problems.
I have also pulled out every disk, and inserted them again in other slots and everything worked fine. No problem. It helps to do a "zpool export" and "import" also.
I don't understand all your problems with ZFS. Something is wrong; you should be able to pull out disks and replace them without problems. ZFS is designed for that. I don't understand why you don't succeed.
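For reference, the routine being described - cleanly detaching a pool before pulling disks, bringing it back, replacing a failed disk, and verifying the result - is only a handful of zpool commands. A sketch with placeholder pool and device names:

zpool status tank            # check pool health before touching anything
zpool export tank            # cleanly detach the pool before pulling or moving disks
zpool import tank            # re-import once the disks are reconnected ('zpool import' alone lists importable pools)
zpool replace tank sdd sdh   # swap a failed disk for a new one; a resilver starts automatically
zpool scrub tank             # verify checksums across the whole pool afterwards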
plonk420 - Sunday, September 4, 2011 - link
A friend has had good luck with a $100ish 8x SATA-II PCI-X Supermicro card (no RAID). He uses LVM in Ubuntu Server; I think they have some PCIe cards in the same price range, too. I got a cheapish server-grade card WITH RAID (I had to do some heavy research to see if it was compatible with Linux), however it seems there's no SMART monitoring on it (at least in the drive manager GUI; I'm a wuss, obviously).
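For what it's worth, the LVM approach mentioned there pools several plain disks into one big volume without any RAID card - but also without any redundancy, so keep backups. A minimal sketch with placeholder device and volume names:

pvcreate /dev/sdb /dev/sdc /dev/sdd           # mark the disks for LVM use
vgcreate storage /dev/sdb /dev/sdc /dev/sdd   # group them into one volume group
lvcreate -l 100%FREE -n share storage         # one logical volume spanning all free space
mkfs.ext4 /dev/storage/share                  # format it like any other disk
mount /dev/storage/share /srv/share           # and mount it for sharing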
nexox - Wednesday, September 7, 2011 - link
Well, there are about a million replies here, but I think I've got some information that others have missed:

1) Motherboard SATA controllers generally suck. They're just no good. I don't know why this site insists on benchmarking SSDs with them. They tend to be slow and handle errors poorly. Yes, I've tested this a fair amount.
2) Hardware RAID has its positives and negatives, but generally it's not necessary, at least in Linux with mdraid - I can't speak for Windows.
So what do you do with these facts? You get a quality Host Bus Adapter (HBA). These cards generally provide basic RAID levels (0, 1), but mostly they just give you extra SAS/SATA ports, with decent hardware. I personally like the LSI HBAs (since LSI bought most of the other storage controller companies), which come in 3Gbit and 6Gbit SAS/SATA, on PCI-Express x4 and x8, with anywhere from 4 to 16 ports. 8 lanes of PCI-Express 2.0 will support about 4GB/s read, which should be enough. And yes, SAS controllers are compatible with SATA devices.
Get yourself an LSI card for your storage drives, use onboard SATA for your boot drives (software RAID 1), and run software RAID 5 for storage.
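Roughly, that layout looks like this with Linux mdraid (a sketch only - the partition names and disk counts are placeholders):

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1                        # mirrored boot/OS array
mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1    # RAID 5 storage array
mkfs.ext4 /dev/md1                               # format the storage array
mdadm --detail --scan >> /etc/mdadm/mdadm.conf   # persist the array definitions (config path varies by distro)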
Of course this means you can't use an Atom board, since they generally don't have PCIe, and even the Brazos boards only offer PCIe x4 (even if the slots look like x16).
For some reason SAS HBAs are some kind of secret, but they're really the way to go for a reliable, cheap(ish) system. I have a $550 (at the time) 8-port hardware RAID card, which is awesome (it managed to read from a degraded 8-disk RAID 5 at 550MB/s, CPU limited, on relatively old and slow 1TB drives, which isn't going to happen with software RAID), but when I build my next server (or cluster - google ceph) I will be going with software RAID on a SAS HBA.
marcus77 - Saturday, October 6, 2012 - link
I would recommend euroNAS http://www.euronas.com as the OS because it provides more flexibility (you can decide which hardware to use and can upgrade it easily).

RAID controllers don't always make sense - especially when it comes to recovery (multiple drive failures), software RAID is much more powerful than most RAID controllers.
If you wish to use many drives you will need an additional controller - LSI makes pretty good HBAs; they don't provide RAID functionality but have many ports for the drives. You could use one in combination with software RAID. http://www.lsi.com/products/storagecomponents/Page...
If you are looking for a real HW RAID controller, I would recommend Adaptec - they have very good Linux support, and Linux is what is mostly used with storage servers.