On 09/17/2013 03:26 AM, james.bellinger at icecube.wisc.edu wrote:
> I inherited a system with a wide mix of array sizes (no replication) in
> 3.2.2, and wanted to drain data from a failing array.
>
> I upgraded to 3.3.2, and began a
> gluster volume remove-brick scratch "gfs-node01:/sda" start
>
> After some time I got this:
> gluster volume remove-brick scratch "gfs-node01:/sda" status
>       Node   Rebalanced-files        size      scanned     failures         status
>  ---------        -----------  ----------  -----------  -----------   ------------
>  localhost                  0      0Bytes            0            0    not started
> gfs-node06                  0      0Bytes            0            0    not started
> gfs-node03                  0      0Bytes            0            0    not started
> gfs-node05                  0      0Bytes            0            0    not started
> gfs-node01         2257394624       2.8TB      5161640       208878      completed
>
> Two things jump instantly to mind:
> 1) The number of failures is rather large

Can you check the rebalance logs (/var/log/scratch-rebalance.log) to
figure out what the error messages are?

> 2) A _different_ disk seems to have been _partially_ drained.
> /dev/sda              2.8T  2.7T   12G 100% /sda
> /dev/sdb              2.8T  769G  2.0T  28% /sdb
> /dev/sdc              2.8T  2.1T  698G  75% /sdc
> /dev/sdd              2.8T  2.2T  589G  79% /sdd
>

I know this sounds silly, but just to be sure, is /dev/sda actually
mounted on "gfs-node01:/sda"? If yes, the files that _were_ successfully
rebalanced should have been moved from gfs-node01:/sda to one of the
other bricks. Is that the case?

> When I mount the system it is read-only (another problem I want to fix

Again, the mount logs could shed some light on this. (btw, a successful
rebalance start/status sequence should be followed by the rebalance
'commit' command to ensure the volume information gets updated.)

> ASAP) so I'm pretty sure the failures aren't due to users changing the
> system underneath me.
>
> Thanks for any pointers.
>
> James Bellinger
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
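
For readers following the thread, here is a minimal sketch of the drain-and-verify
sequence being discussed, using the volume and brick names from this thread; the
grep pattern is just one common way to pull error lines out of the rebalance log
and may need adjusting for your setup:

    # Confirm /dev/sda really is the device backing the brick path
    df -h /sda
    mount | grep ' /sda '

    # Drain the brick, then poll until status reports "completed"
    gluster volume remove-brick scratch gfs-node01:/sda start
    gluster volume remove-brick scratch gfs-node01:/sda status

    # Inspect the rebalance log for the cause of the failure count
    grep ' E ' /var/log/scratch-rebalance.log | tail -n 50

    # Only after a clean "completed" status, commit so the volume
    # configuration is updated to drop the brick
    gluster volume remove-brick scratch gfs-node01:/sda commit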