On 10/18/2011 06:14 AM, Robert Krig wrote:
> I think I'm going to have to abandon GlusterFS for our Image files. The
> performance is abysmal. I've tried all sorts of settings, but at some
> point the http process just keeps spawning more and more processes
> because clients are waiting because the directory can't be read, since
> glusterfs is busy.
> We're not even reaching 500 apache requests per second and already
> apache locks up.
>
> I'm pretty sure it can't be the hardware, since we're talking about a 12
> Core Hyperthreading Xeon CPU, with 48GB of ram and 30TB of storage in a
> hardware Raid.

From our experience, and please don't take this the wrong way, the vast
majority of storage users (and for that matter, storage companies) don't
know how to design their RAIDs to their needs.

A "fast" CPU (12 core Xeon would be X5650 or higher) won't impact small
file read speed all that much. 48 GB of ram could, if you can cache
enough of your small files.

What you need, for your small random file reads, is an SSD or Flash
cache. It has to be large enough that it's relevant for your use case. I
am not sure what your working set size is for your images, but you can
buy these in sizes from small 300GB units up to several tens of TB.
Small random file performance on them is extremely good, and you can put
gluster atop one as a file system if you wish to run the images off the
cache ... or you can use it as a block level cache, which you then need
to warm up prior to initial use (and then adjust after changes).

> I realise that GlusterFS is not ideal for many small files, but this is
> beyond ridiculous. It certainly doesn't help that the documentation
> doesn't even properly explain how to activate different translators, or
> where exactly to edit them by hand in the config files.
>
> If anyone has any suggestions, I'd be happy to hear them.

See above. As noted, most people (and companies) do anywhere from a bad
to terrible job on storage system design.
No one should be using a large RAID5 or RAID6 for small random file
reads. It's simply the wrong design. I am guessing it's unlikely that you
have a RAID10, but even with that, you are going to be rate limited by
the number of drives you have and their roughly 100 IOPS each.

This particular problem isn't likely Gluster's fault. It is likely your
storage design. I'd suggest doing a quick test using fio to ascertain
how many random read IOPS you can get out of your file system. If you
want to handle 500 apache requests per second, how many IOPS does this
imply (how many files does each request need to read)? Chances are that
you exceed the IOPS capacity of your storage several times over.

Your best bet is either a caching system, or putting the small randomly
accessed image files on SSD or Flash, and using that. Try that before
you abandon Gluster.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
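
As a rough illustration of the IOPS question raised above, the estimate can be sketched in a few lines of Python. Only the 500 requests per second and the roughly 100 IOPS per drive come from this thread. The drive count (12), the files per request (4), and the I/Os per file (3, for directory lookup, inode, and data) are made-up placeholders; substitute your own measured values, ideally from a fio random-read run.

```python
def required_iops(requests_per_sec, files_per_request,
                  ios_per_file=1.0, cache_hit_ratio=0.0):
    """Random-read IOPS that actually have to reach the disks."""
    return (requests_per_sec * files_per_request * ios_per_file
            * (1.0 - cache_hit_ratio))


def array_read_iops(n_drives, iops_per_drive=100):
    """Optimistic upper bound: every spindle serves independent random reads."""
    return n_drives * iops_per_drive


# 500 requests/s is from the thread; the other inputs are assumptions.
demand = required_iops(500, files_per_request=4, ios_per_file=3)
capacity = array_read_iops(n_drives=12)  # hypothetical 12-spindle array

print(f"demand:   {demand:.0f} IOPS")
print(f"capacity: {capacity} IOPS")
print(f"oversubscribed by {demand / capacity:.1f}x")

# Same workload if a RAM/SSD cache absorbed 90% of the reads (assumed ratio).
cached_demand = required_iops(500, 4, ios_per_file=3, cache_hit_ratio=0.9)
print(f"demand with 90% cache hits: {cached_demand:.0f} IOPS")
```

With these placeholder numbers the disks are asked for about five times what they can deliver, while a cache that absorbs most reads brings the demand back under the spindle limit. The real ratio depends entirely on the measured values.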