All, I'm in the market for some hardware. I'm planning to put together a Gluster "cloud" to make our lab storage more performant and reliable. I'm thinking about buying 8 servers, each with 4 x 2TB 7200 rpm SATA drives (expandable to 8 drives). Each server will have 8 network ports: 4 link-aggregated ports connected to a SAN switch and the other 4 aggregated ports connected to a LAN switch. The servers will run CentOS 6.2 Linux. The LAN side will run Samba and export the network shares, and the SAN side will run the Gluster daemon. With 8 machines and 4 SAN ports each, I need 32 ports total, so I'm thinking a 48-port switch would work well as the SAN back-end switch, leaving spare ports for iSCSI devices and backup servers that also need to hook into the SAN. On a budget, I'm planning to use custom-built Supermicro servers with a D-Link 48-port Layer 2 switch for the SAN.

I've already put together a test Gluster setup using some virtual machines and it looks very good. As I move to designing the production configuration, I'm wondering if there are best practices for how to set up shares, bricks, etc. Right now I'm thinking something like this:

1) Where should bricks be stored? I'd like bricks to stay out of sight so admins are not tempted to accidentally write data directly to a brick instead of the Gluster mount. Something like:

   /brick/[brick dir name|brick mount dir]

   (A rough sketch of a volume layout along these lines is after this list.)

2) Where should the GlusterFS volume be mounted on the box? I'm thinking of using either /mnt or creating a new /gluster directory for the mount points (a mount/fstab sketch is also below):

   /mnt/[gluster share]   or   /gluster/[gluster share]

3) Common Samba configurations? In my virtual machine tests, I had problems mounting my Samba shares on Mac and Windows. As I started configuring, it turned out I needed a bunch of Samba-specific rules to deal with .DS_Store files on the Mac, adjust directory and file modes, set socket options, and handle general Active Directory integration. Are there best practices for smb.conf when used with Gluster? (The kind of fragment I've been experimenting with is sketched below.) Over time, I'm hoping to go through my network and replace all Windows storage servers with Samba, whether I'm using Gluster or not. If any of you have pointers on this, that would be great.

4) Performance tuning. So far, I've been using dd and iperf to debug my transfer rates (example commands below). I use dd to test the raw speed of the underlying disks (should I use RAID 0, RAID 1, or RAID 5?), then iperf to test the speed of the network (to make sure I'm getting the bandwidth I expect). Then I use dd again to test read and write speed to and from the Gluster mount point. If all looks good, I move to testing transfers all the way to a Windows 7 box that mounts the storage servers over Samba. So I end up testing the whole path:

   win7 -> network -> samba -> gluster -> brick -> ext4 -> sata hdd

5) Preferred striping or layout? I want fast, good, and cheap! http://en.wikipedia.org/wiki/Project_triangle Since I already know the hardware, my costs are pretty much fixed, so next I want to get the most "good" and "fast" from that cost. I'm thinking RAID 10, but at the network level: if the drives in each of the 8 servers are RAID 0, I can use "replica 2" through Gluster and get the RAID 1 equivalent (that is what the volume sketch below assumes). I think using replica 2 in Gluster will halve my network write/read speed, though. If instead I use RAID 1 in hardware and no replication in Gluster, I get a RAID 0-1 equivalent overall, but I cannot afford to lose an entire storage server.
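For (1) and (5), here is the kind of volume layout I'm picturing; the hostnames (server1..server8), the volume name "labvol", and the brick path under /bricks are placeholders, not a recommendation. With replica 2, consecutive bricks on the command line pair up as mirrors, so server1/server2 mirror each other, server3/server4, and so on:

   # run once from any node, after "gluster peer probe" of the other 7
   gluster volume create labvol replica 2 transport tcp \
       server1:/bricks/labvol server2:/bricks/labvol \
       server3:/bricks/labvol server4:/bricks/labvol \
       server5:/bricks/labvol server6:/bricks/labvol \
       server7:/bricks/labvol server8:/bricks/labvol
   gluster volume start labvol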
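For (2), mounting the volume natively on the box that runs Samba might look something like this (again, names and paths are made up):

   mkdir -p /gluster/labvol
   mount -t glusterfs server1:/labvol /gluster/labvol

   # or in /etc/fstab:
   server1:/labvol  /gluster/labvol  glusterfs  defaults,_netdev  0 0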
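For (3), this is roughly the smb.conf fragment I've been experimenting with in the VMs; it is not a complete AD configuration, and the realm, workgroup, share name, and path are all placeholders:

   [global]
       socket options = TCP_NODELAY SO_RCVBUF=131072 SO_SNDBUF=131072
       # Active Directory integration (values are placeholders)
       security = ads
       realm = EXAMPLE.EDU
       workgroup = EXAMPLE

   [labshare]
       path = /gluster/labvol
       read only = no
       create mask = 0664
       directory mask = 0775
       # hide the Mac metadata litter
       veto files = /.DS_Store/._.DS_Store/
       delete veto files = yes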
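And for (4), the kind of dd/iperf commands I've been using, more or less (sizes, paths, and hostnames are arbitrary):

   # raw disk / brick filesystem speed (write, then read back)
   dd if=/dev/zero of=/bricks/labvol/ddtest bs=1M count=4096 conv=fdatasync
   dd if=/bricks/labvol/ddtest of=/dev/null bs=1M

   # network: run "iperf -s" on server1, then from another node:
   iperf -c server1 -t 30 -P 4

   # same write/read test through the gluster mount
   dd if=/dev/zero of=/gluster/labvol/ddtest bs=1M count=4096 conv=fdatasync
   dd if=/gluster/labvol/ddtest of=/dev/null bs=1M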
This is for a lab network of 500+ computers with Active Directory, where user profiles, desktops, and redirected Movies, Photos, and My Documents folders are all stored on network storage, so I need both performance and reliability. Are there any papers out there showing how another large university has achieved this using Gluster? TIA!

--
Dante

D. Dante Lorenso
dante at lorenso.com