Hi,

Does anybody use a GlusterFS (3.1.1) system for hosting users' home folders or other collections of small files? I'm having major performance issues, which I'm pretty sure come down to the hardware, but I would appreciate more feedback / ideas / comments.

While I had been working on a two-node Gluster system to hold virtual machine images, I was pressed into moving the users' home folders (30-40 users are active by day) onto this system very suddenly and very urgently. The reason was suspicious hardware on the previous system, which was a single NFS server with LVM over 2 units: raid5, 5 x 300GB.

Unfortunately, the hardware for the Gluster system was not really meant for production use, and the result for our users is an awful desktop experience.

One system is a dual Xeon with 8GB RAM; the other is a low-power AMD CPU (dual core) with only 1GB RAM. Each system has two SATA hard disks: a 300GB boot disk and a 2TB hard drive for Gluster (no RAID yet). Worst(?) of all, the 2TB hard disk is a WD Caviar "GREEN" (5400 RPM). Gigabit networking...

The software is Linux, Ubuntu 10.04 server, 64-bit edition. Gluster is running within an LXC container, as I had considered putting some other small non-resource-intensive services in other LXC containers, but I haven't yet put anything else on the system.

The desktops are Ubuntu 10.04 desktop edition, i386. I used the Debian i386 packages for installation, and so am mounting Gluster directly. The home folders are on a central file server, so almost everything users open ends up there. Kerberos + LDAP keeps users separate. NFS was not really an option: very frequently it would say "NFS server not responding" and shortly after say "NFS server ok", and these timeouts were simply too frequent.

For Gluster, I skipped updating the internal DNS (since our authoritative DNS server is inside a virtual machine) and instead put the addresses of the two Gluster servers in the hosts file of every desktop / server.
Reverse DNS is set up for the internal network, so the BE/Gluster servers can properly resolve the IP addresses to names.

Performance stats seem to show hardware limits, but I am left with a bit of doubt. The system seems to be working in bursts, so maybe some system caches could also help? To date I've only changed "performance.cache-refresh-timeout" to 10 seconds. This seemed to help, but only a little and only for a short while. Are there any more software optimizations/changes I could make?

I'm planning on:

* Almost immediately:
  - Replacing the Caviar Green hard disks with Caviar Black
* Very soon:
  - Putting aside the low-power PC and moving to a 1 x Xeon + 4GB RAM machine instead
  - Changing to (software) RAID 5 over multiple disks

What would your opinions be: would this clear up the performance issues for the desktops?

Would I get more or less performance by creating a virtual machine (one big flat file on the Gluster storage) to store and export the files?

Thinking longer term, is there any way to take advantage of the local disks (standard for our PCs: 80GB HDD) for extra performance on the local system (a local file system cache), yet preserve "mobility" for the users?
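To put numbers on the small-file behaviour described above, and to compare before and after the planned hardware changes, something like the following could be run from a desktop. This is only a minimal sketch assuming Python 3.6+; the function name `time_small_files`, the file count and the file size are made up for illustration and are not part of Gluster or of the current setup.

```python
import os
import tempfile
import time


def time_small_files(directory, count=200, size=4096):
    """Create, stat, read back and delete `count` files of `size` bytes.

    Returns a dict mapping each phase name to its elapsed time in seconds.
    """
    payload = b"x" * size
    paths = [os.path.join(directory, f"smallfile_{i:05d}.tmp") for i in range(count)]
    timings = {}

    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the data out of the local write buffers
    timings["create"] = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.stat(path)  # metadata lookup only
    timings["stat"] = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        with open(path, "rb") as handle:
            handle.read()  # may be served from a client-side cache
    timings["read"] = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.remove(path)
    timings["delete"] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Placeholder target: a scratch directory in the default temp location.
    # Pass dir="<some directory on the Gluster mount>" to measure that instead.
    with tempfile.TemporaryDirectory() as scratch:
        for phase, seconds in time_small_files(scratch).items():
            print(f"{phase:>6}: {seconds:.3f}s")
```

Running it once against a local disk and once against a directory on the Gluster mount gives a rough per-phase comparison. The read phase is likely to be flattered by caching, so the create and stat timings are probably the more telling ones.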
Some output from the system:

Load:
================
localadmin@BE1:~$ uptime
 12:44:34 up 1 day, 19:58,  4 users,  load average: 3.04, 4.34, 5.14
localadmin@BE1:~$ uptime
 12:49:42 up 1 day, 20:03,  4 users,  load average: 8.46, 6.77, 6.00
================

Memory availability:
================
top - 12:57:53 up 1 day, 20:12,  4 users,  load average: 7.92, 7.04, 6.53
Tasks: 141 total,   1 running, 140 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.6%us,  1.8%sy,  0.0%ni, 10.0%id, 81.3%wa,  0.3%hi,  1.0%si,  0.0%st
Mem:    957948k total,   926428k used,    31520k free,    43804k buffers
Swap:  2805752k total,      796k used,  2804956k free,   618824k cached
================

iostat -x 1
================
Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm  %util
sda         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00   0.00
sdb         0.00   35.00    2.00    6.00   16.00  320.00    42.00     0.30   36.25   23.75  19.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.46    0.00    5.83   26.70    0.00   66.02

Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm  %util
sda         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00   0.00
sdb         0.00   40.00    1.00   39.00    8.00  608.00    15.40     0.78   16.50   19.25  77.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    2.01   45.73    0.00   52.26

Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm  %util
sda         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00   0.00
sdb         0.00    0.00    0.00    3.00    0.00   16.00     5.33     0.98  370.00  326.67  98.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.98    0.00    2.44    0.00    0.00   96.59

Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm  %util
sda         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00   0.00
sdb         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00   0.00
================

All comments and ideas are quite welcome!

Andrew
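For anyone who wants to go through a longer `iostat -x 1` capture than the few samples above, here is a minimal sketch in Python 3 that picks out the seconds where a disk is close to saturated and shows how little work it was doing at the time. The function names and the 90% threshold are arbitrary choices for illustration, and it assumes the column layout shown above, with `rsec/s` and `wsec/s` counted in 512-byte sectors.

```python
SECTOR_BYTES = 512  # assumption: iostat's rsec/s and wsec/s use 512-byte sectors


def parse_iostat_devices(text):
    """Parse the per-device rows of `iostat -x` output into a list of dicts."""
    samples = []
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Device:":
            columns = fields[1:]  # column names for the rows that follow
            continue
        if fields[0] == "avg-cpu:":
            columns = None  # the next line holds CPU figures, not a device row
            continue
        if columns and len(fields) == len(columns) + 1:
            try:
                values = [float(value) for value in fields[1:]]
            except ValueError:
                continue  # not a numeric device row, skip it
            row = dict(zip(columns, values))
            row["device"] = fields[0]
            samples.append(row)
    return samples


def busy_samples(samples, util_threshold=90.0):
    """Return the samples whose %util is at or above the threshold."""
    return [s for s in samples if s["%util"] >= util_threshold]


# Two of the sdb samples from the output above.
SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdb 0.00 40.00 1.00 39.00 8.00 608.00 15.40 0.78 16.50 19.25 77.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 2.01 45.73 0.00 52.26
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdb 0.00 0.00 0.00 3.00 0.00 16.00 5.33 0.98 370.00 326.67 98.00
"""

if __name__ == "__main__":
    for s in busy_samples(parse_iostat_devices(SAMPLE)):
        requests = s["r/s"] + s["w/s"]
        kib_per_s = (s["rsec/s"] + s["wsec/s"]) * SECTOR_BYTES / 1024
        print(
            f"{s['device']}: {requests:.0f} req/s, {kib_per_s:.0f} KiB/s, "
            f"await {s['await']:.0f} ms, util {s['%util']:.0f}%"
        )
```

On the two samples included here, only the second one crosses the threshold: 3 requests per second and 8 KiB/s at 98% utilisation with a 370 ms await. That looks like a disk spending its time on something other than moving data, though the script only reports the numbers and does not say why.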