On 09/17/2013 03:26 AM, james.bellinger at icecube.wisc.edu wrote:
> I inherited a system with a wide mix of array sizes (no replication) in
> 3.2.2, and wanted to drain data from a failing array.
>
> I upgraded to 3.3.2, and began a
> gluster volume remove-brick scratch "gfs-node01:/sda" start
>
> After some time I got this:
> gluster volume remove-brick scratch "gfs-node01:/sda" status
>       Node   Rebalanced-files        size      scanned     failures         status
>  ---------        -----------  ----------  -----------  -----------   ------------
>  localhost                  0      0Bytes            0            0    not started
> gfs-node06                  0      0Bytes            0            0    not started
> gfs-node03                  0      0Bytes            0            0    not started
> gfs-node05                  0      0Bytes            0            0    not started
> gfs-node01         2257394624       2.8TB      5161640       208878      completed
>
> Two things jump instantly to mind:
> 1) The number of failures is rather large

Can you check the rebalance logs (/var/log/scratch-rebalance.log) to
figure out what the error messages are?

> 2) A _different_ disk seems to have been _partially_ drained.
> /dev/sda              2.8T  2.7T   12G 100% /sda
> /dev/sdb              2.8T  769G  2.0T  28% /sdb
> /dev/sdc              2.8T  2.1T  698G  75% /sdc
> /dev/sdd              2.8T  2.2T  589G  79% /sdd
>

I know this sounds silly, but just to be sure, is /dev/sda actually
mounted on "gfs-node01:/sda"? If yes, the files that _were_ successfully
rebalanced should have been moved from gfs-node01:/sda to one of the
other bricks. Is that the case?

> When I mount the system it is read-only (another problem I want to fix

Again, the mount logs could shed some light on this. (btw, a successful
rebalance start/status sequence should be followed by the rebalance
'commit' command to ensure the volume information gets updated.)

> ASAP) so I'm pretty sure the failures aren't due to users changing the
> system underneath me.
>
> Thanks for any pointers.
>
> James Bellinger
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
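
For readers following the thread, here is a minimal sketch of the drain-and-verify
sequence being discussed, using the volume and brick names from this thread; the
grep pattern is just one common way to pull error lines out of the rebalance log
and may need adjusting for your setup:

    # Confirm /dev/sda really is the device backing the brick path
    df -h /sda
    mount | grep ' /sda '

    # Drain the brick, then poll until status reports "completed"
    gluster volume remove-brick scratch gfs-node01:/sda start
    gluster volume remove-brick scratch gfs-node01:/sda status

    # Inspect the rebalance log for the cause of the failure count
    grep ' E ' /var/log/scratch-rebalance.log | tail -n 50

    # Only after a clean "completed" status, commit so the volume
    # configuration is updated to drop the brick
    gluster volume remove-brick scratch gfs-node01:/sda commit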