On 21/05/13 22:45, Joseph Santaniello wrote:
> Hello All,
>
> I am exploring options for deploying a Gluster system, and one
> possible scenario we are contemplating would involve potentially
> thousands (1-2000) of volumes with correspondingly thousands of
> mounts.
>
> Is there any intrinsic reason why this would be a bad idea with
> Gluster?

Two thoughts occur to me.

Firstly, memory consumption: Gluster spawns a process for every volume
on the servers and for every mount on the clients, so you'd end up
with a lot of glusterfs processes running on each machine. That's a
lot of context switching for the kernel to do, and they'll use a
non-negligible amount of memory. I'm not sure what the real-world
memory requirement per process is; on a couple of machines I just
checked it looks like 15-30M (VmRSS minus VmLib), but your mileage may
vary. Even if your memory use per gluster process is only 24M, that's
still 48G of RAM just to launch a couple of thousand of them; if they
turn out to need more like 128M each, that's a quarter of a terabyte
of memory per machine. A rough way to measure this on your own boxes
is sketched at the end of this mail.

The second thing that worries me is that Gluster's recovery mechanism
has nothing to prevent simultaneous recovery across all the volumes on
a node. As soon as a bad node rejoins the cluster, all 2000 of your
volumes will start rebuilding at the same time, causing massive random
I/O load, and all your clients will starve. That happens to me even
with just a couple of dozen volumes, so I hate to think how it would
go with thousands! One possible workaround is also sketched below.

-Toby
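
P.S. If you want to check the per-process numbers on your own
hardware, here's a minimal sketch of the measurement I described:
Python, reading VmRSS and VmLib out of /proc. Matching processes by
the name "glusterfs" is just my assumption; adjust for glusterfsd etc.
as needed.

    #!/usr/bin/env python
    # Rough sketch: estimate per-process memory of glusterfs daemons by
    # reading VmRSS and VmLib from /proc/<pid>/status, i.e. the
    # VmRSS-VmLib figure quoted above.
    import os

    def status_kb(pid, field):
        # Return the value (in kB) of a Vm* field from
        # /proc/<pid>/status, or 0 if the field is absent.
        with open('/proc/%s/status' % pid) as f:
            for line in f:
                if line.startswith(field + ':'):
                    return int(line.split()[1])  # "VmRSS:  24576 kB"
        return 0

    total_kb = 0
    count = 0
    for pid in os.listdir('/proc'):
        if not pid.isdigit():
            continue
        try:
            with open('/proc/%s/comm' % pid) as f:
                comm = f.read().strip()
            if not comm.startswith('glusterfs'):
                continue
            kb = status_kb(pid, 'VmRSS') - status_kb(pid, 'VmLib')
        except IOError:
            continue  # process exited while we were scanning
        total_kb += kb
        count += 1
        print('%s (%s): %d MB' % (pid, comm, kb // 1024))

    if count:
        per_mb = total_kb // count // 1024
        print('average: ~%d MB per process' % per_mb)
        # Extrapolate to the 2000-volume scenario discussed above:
        print('2000 processes: ~%d GB' % (per_mb * 2000 // 1024))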
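
On the recovery storm: there's no built-in throttle that I know of,
but you could approximate one from outside by kicking heals off one
volume at a time and waiting for each to drain before starting the
next. A rough sketch, assuming the standard gluster CLI; the parsing
of the "heal <vol> info" output below is simplified and
version-dependent, so treat it as a starting point rather than a
finished tool:

    #!/usr/bin/env python
    # Rough sketch of an external heal throttle: trigger self-heal one
    # volume at a time instead of letting every volume rebuild at once.
    import subprocess
    import time

    def gluster_volumes():
        # "gluster volume list" prints one volume name per line.
        out = subprocess.check_output(
            ['gluster', 'volume', 'list']).decode()
        return out.split()

    def entries_pending(vol):
        # Sum the per-brick "Number of entries:" lines that
        # "gluster volume heal <vol> info" prints.
        out = subprocess.check_output(
            ['gluster', 'volume', 'heal', vol, 'info']).decode()
        total = 0
        for line in out.splitlines():
            if line.strip().startswith('Number of entries:'):
                total += int(line.split(':')[1])
        return total

    for vol in gluster_volumes():
        # Kick off an index heal, then let it drain before moving on.
        subprocess.call(['gluster', 'volume', 'heal', vol])
        while entries_pending(vol) > 0:
            time.sleep(30)

Serialising the heals like this trades a longer total recovery time
for keeping the random I/O load bounded, so clients on the volumes
that aren't currently healing stay responsive.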