Hi list,

after some rather depressing, unsuccessful attempts I'm wondering whether someone has a hint as to what we could do to accomplish the above task on a production system: every time we have tried it so far we had to roll back, because the load on the servers and the clients climbed so high that our application became unusable. Our understanding is that the introduction of the namespace is killing us, but we have not found a way to get around the problem.

The setup: 4 servers, each with two bricks and a namespace; the bricks sit on separate RAID arrays. The clients do an AFR so that servers 1 and 2 mirror each other, as do 3 and 4. The four resulting AFRs are then unified (see the configs below). This setup has been working so far, but not very stably (i.e. we see memory leaks on the client side). In the upgraded version the four namespaces are AFR-ed as well.

We have about 20 clients connected that only write, and do so rarely, and 7 clients that only read, but do so massively (these are the Apache web servers serving the images). All machines are connected via Gigabit Ethernet.

Maybe the source of the problem is what we store on the cluster: about 12 million images, adding up to ~300 GB, in a very, very nested directory structure. So, lots of relatively small files. And we are about to add another 15 million files of even smaller size; they consume only 50 GB in total, most of them just 1 or 2 KB.

Now, if we start the new glusterfs with a new, empty namespace, it takes only minutes for the load on the servers to reach around 1.5 and for the load on the reading clients to jump as high as 200(!). Obviously, no more images get delivered to the connected browsers. You can imagine that we did not even remotely consider adding the load of forcing a namespace rebuild on top of that, so all of this load seems to come from self-heal. In an earlier attempt with 1.3.2 the picture didn't change much even after a forced rebuild of the namespace (which took about 24(!) hours). Also, using only one namespace brick and no AFR did help, but it became clear that the server holding the namespace was much more loaded than the others.

So far we have not found a proper way to reproduce the problem on a test system, which makes finding a solution even harder :-(

One idea that comes to mind: could we somehow prepare the namespace bricks on the old-version cluster, to reduce how much the self-heal mechanism has to do after the upgrade? (A rough sketch of what we have in mind is appended below, after the configs.)

Thanks for reading this far; I hope I've drawn the picture thoroughly. Please let me know if anything is missing.

Cheers,
Sascha

server config:

volume fsbrick1
  type storage/posix
  option directory /data1
end-volume

volume fsbrick2
  type storage/posix
  option directory /data2
end-volume

volume nsfsbrick1
  type storage/posix
  option directory /data-ns1
end-volume

volume brick1
  type performance/io-threads
  option thread-count 8
  option queue-limit 1024
  subvolumes fsbrick1
end-volume

volume brick2
  type performance/io-threads
  option thread-count 8
  option queue-limit 1024
  subvolumes fsbrick2
end-volume

### Add network serving capability to above bricks.
volume server
  type protocol/server
  option transport-type tcp/server   # For TCP/IP transport
  option listen-port 6996            # Default is 6996
  option client-volume-filename /etc/glusterfs/glusterfs-client.vol
  subvolumes brick1 brick2 nsfsbrick1
  option auth.ip.brick1.allow *      # Allow access to "brick" volume
  option auth.ip.brick2.allow *      # Allow access to "brick" volume
  option auth.ip.nsfsbrick1.allow *  # Allow access to "brick" volume
end-volume

-----------------------------------------------------------------------
client config:

volume fsc1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.95
  option remote-subvolume brick1
end-volume

volume fsc1r
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.95
  option remote-subvolume brick2
end-volume

volume fsc2
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.96
  option remote-subvolume brick1
end-volume

volume fsc2r
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.96
  option remote-subvolume brick2
end-volume

volume fsc3
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.97
  option remote-subvolume brick1
end-volume

volume fsc3r
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.97
  option remote-subvolume brick2
end-volume

volume fsc4
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.98
  option remote-subvolume brick1
end-volume

volume fsc4r
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.98
  option remote-subvolume brick2
end-volume

volume afr1
  type cluster/afr
  subvolumes fsc1 fsc2r
end-volume

volume afr2
  type cluster/afr
  subvolumes fsc2 fsc1r
end-volume

volume afr3
  type cluster/afr
  subvolumes fsc3 fsc4r
end-volume

volume afr4
  type cluster/afr
  subvolumes fsc4 fsc3r
end-volume

volume ns1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.95
  option remote-subvolume nsfsbrick1
end-volume

volume ns2
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.96
  option remote-subvolume nsfsbrick1
end-volume

volume ns3
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.97
  option remote-subvolume nsfsbrick1
end-volume

volume ns4
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.10.1.98
  option remote-subvolume nsfsbrick1
end-volume

volume afrns
  type cluster/afr
  subvolumes ns1 ns2 ns3 ns4
end-volume

volume bricks
  type cluster/unify
  subvolumes afr1 afr2 afr3 afr4
  option namespace afrns
  option scheduler alu
  option alu.limits.min-free-disk 5%
  option alu.limits.max-open-files 10000
  option alu.order disk-usage:read-usage:write-usage:open-files-usage:disk-speed-usage
  option alu.disk-usage.entry-threshold 2GB
  option alu.disk-usage.exit-threshold 60MB
  option alu.open-files-usage.entry-threshold 1024
  option alu.open-files-usage.exit-threshold 32
end-volume

volume readahead
  type performance/read-ahead
  option page-size 256KB
  option page-count 2
  subvolumes bricks
end-volume

volume write-behind
  type performance/write-behind
  option aggregate-size 1MB
  subvolumes readahead
end-volume

volume io-cache
  type performance/io-cache
  option page-size 128KB
  option cache-size 64MB
  subvolumes write-behind
end-volume
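
PS: To make the namespace idea a bit more concrete, here is a rough, untested sketch (Python) of what we have in mind for preparing the namespace bricks before the switch. It assumes that the namespace export only needs the directory tree plus zero-byte placeholder entries for every file, and that anything else (e.g. extended attributes) would then be fixed up by self-heal; the paths are the ones from the server config above. Please correct us if that assumption about the namespace layout is wrong.

#!/usr/bin/env python
# Rough sketch: pre-populate the namespace export (/data-ns1) from the
# local data bricks (/data1, /data2) before switching to the new setup.
# Assumption: the namespace only needs directories plus zero-byte files.

import os

DATA_BRICKS = ["/data1", "/data2"]  # storage/posix exports on this server
NAMESPACE = "/data-ns1"             # namespace export on this server

def populate(brick, ns_root):
    for dirpath, dirnames, filenames in os.walk(brick):
        rel = os.path.relpath(dirpath, brick)
        ns_dir = ns_root if rel == "." else os.path.join(ns_root, rel)
        if not os.path.isdir(ns_dir):
            os.makedirs(ns_dir)
        for name in filenames:
            placeholder = os.path.join(ns_dir, name)
            if not os.path.exists(placeholder):
                # zero-byte placeholder; the data itself stays on the bricks
                open(placeholder, "w").close()

if __name__ == "__main__":
    for brick in DATA_BRICKS:
        populate(brick, NAMESPACE)

Since the data bricks are AFR-ed across servers, each server only sees part of the files locally, so we would probably have to run this against a mount of the old (pre-upgrade) volume instead of the local exports; that, too, is only a guess.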