I'm rolling out 4 lots of 6-node GlusterFS setups for my employer. Each node is ~33TB of RAID6 backed storage (16x 3TB SATA disks in RAID6 with a hot spare hanging off an LSI controller, with 2x SSDs configured for caching), and Gluster is configured in distribute-replicate. Each cluster is 200TB of raw space, 100TB usable after replication. When complete, there will be 4 of these clusters.

Each node's storage is formatted as XFS with 512-byte inodes, running a fully patched CentOS6 and Gluster 3.3.1. Each node has a 6 core Xeon processor (with HT for 12 threads) and 32GB of RAM. Each node runs 2x 10Gbps Ethernet over fiber in a bonded configuration (single IP address per node) for a full 20Gbps per node.

GlusterFS FUSE performance under Linux is great (clients run a mix of Ubuntu 12.04 LTS for workstations and CentOS6 for servers). Samba performance back to Windows 7 clients is great. NFS performance, via both Gluster's userspace setup and CentOS6's native NFS4 kernel server, is great to most other systems where we can't get the Gluster FUSE client loaded (large industry-specific Linux boxes that are provided by vendors as a "black box" solution, and only allow limited access via NFS or SMB/CIFS). All testing so far under those conditions shows throughput orders of magnitude higher than our existing single NAS solutions.

MacOSX Finder performance is a problem, however. There's a huge bug in MacOSX itself that prevents using NFS at all (discussions on other mailing lists suggest it appeared somewhere around 10.6, and continues through into 10.7 and 10.8). Mounting via SMB under OSX is more stable than NFS, however in folders with a large number of files, Finder goes looking for a corresponding Apple Resource Fork file (for every "filename.ext", it looks for a "._filename.ext"). Running tcpdump and wireshark on the Gluster nodes shows that the resulting "FILE_NOT_FOUND" error takes a very long time to come back to the client.

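For anyone who wants to reproduce the slow lookups without Finder, here is a minimal sketch that mimics the access pattern described above: for each entry in a directory it stat()s the matching "._" name and times the whole pass. It assumes Python 3 is available on the client; the function names are made up for illustration, and it only approximates what Finder does over SMB.

```python
import os
import sys
import time


def resource_fork_name(filename):
    """Return the "._" companion name that Finder looks for."""
    return "._" + filename


def time_missing_lookups(directory):
    """stat() the "._name" companion of every entry in `directory`.

    Returns (lookups, missing, elapsed_seconds). Entries that are
    themselves "._" files are skipped so they are not probed twice.
    """
    names = [n for n in os.listdir(directory) if not n.startswith("._")]
    missing = 0
    start = time.perf_counter()
    for name in names:
        try:
            os.stat(os.path.join(directory, resource_fork_name(name)))
        except FileNotFoundError:
            # This is the negative lookup that appears to be slow.
            missing += 1
    elapsed = time.perf_counter() - start
    return len(names), missing, elapsed


if __name__ == "__main__":
    # Directory to probe; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    lookups, missing, elapsed = time_missing_lookups(target)
    print("%d lookups, %d missing, %.3f s total" % (lookups, missing, elapsed))
```

Running it against the same directory on a plain NAS mount and on a Gluster-backed mount (for example `python3 time_missing.py /Volumes/share/images`, where both the script name and the path are hypothetical) should show whether the not-found lookups themselves are the slow part.
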
Configuring a single node as a pure NAS with the same software (but no Gluster implementation) is lightning fast. As soon as GlusterFS comes into play, the reporting of each "FILE_NOT_FOUND" slows down dramatically, causing a directory with ~1000 images in it to take well over 5 minutes to display its contents in MacOSX Finder.

This problem is resolved somewhat by switching to AFP (via Netatalk loaded on the GlusterFS nodes), but AFP has its own problems unique to that protocol, and I'd rather stick to GlusterFS-FUSE, NFS or SMB, in that order of preference.

It's worth noting that through the terminal, these problems don't exist. Mounting via SMB, browsing to the volume in terminal and running "ls" or "find" style commands retrieves file listings at a similar speed to Linux and Windows. The problem is limited to clients using Finder to browse directories, and again particularly ones with a large number of files that don't have matching Apple Resource Fork files. (Of note, creating empty files of the matching "._filename.ext" format solves the performance problem, but litters our filestores with millions of empty files, which we don't want.)

I understand the problem is not strictly Gluster's issue. Finder is looking for a heck of a lot of files that don't exist (which is a pretty silly design), and as far as we can see it only occurs when Samba re-exports GlusterFS volumes. And likewise Apple's NFS bug, which has now existed across three releases of their OS, is pretty horrible. But hopefully I can at least describe the problem and prompt some testing by others.

I haven't had a chance to test a MacOSX FUSE client due to time constraints, but that would at least answer the question of whether the problem is Gluster's lag in reporting files not found, or Samba's.

-Dan