Thanks for your replies.

My first question is: can I safely issue a "commit" when the volume does not
seem to have drained? One of the other arrays failed completely near the end
of the draining, so I'm guaranteed some data loss in any event; I just don't
want to put the system in an unstable state. I have questions about that too,
but I'll save those for another message.

Thanks,
James Bellinger

> On 09/17/2013 03:26 AM, james.bellinger at icecube.wisc.edu wrote:
>> I inherited a system with a wide mix of array sizes (no replication) in
>> 3.2.2, and wanted to drain data from a failing array.
>>
>> I upgraded to 3.3.2, and began a
>>     gluster volume remove-brick scratch "gfs-node01:/sda" start
>>
>> After some time I got this:
>>     gluster volume remove-brick scratch "gfs-node01:/sda" status
>>     Node        Rebalanced-files     size     scanned   failures   status
>>     ----------  ----------------  -------  ----------  ---------  -----------
>>     localhost                  0   0Bytes           0          0  not started
>>     gfs-node06                 0   0Bytes           0          0  not started
>>     gfs-node03                 0   0Bytes           0          0  not started
>>     gfs-node05                 0   0Bytes           0          0  not started
>>     gfs-node01        2257394624    2.8TB     5161640     208878  completed
>>
>> Two things jump instantly to mind:
>> 1) The number of failures is rather large
>
> Can you check the rebalance logs (/var/log/scratch-rebalance.log) to
> figure out what the error messages are?
>
>> 2) A _different_ disk seems to have been _partially_ drained.
>>     /dev/sda   2.8T  2.7T   12G  100%  /sda
>>     /dev/sdb   2.8T  769G  2.0T   28%  /sdb
>>     /dev/sdc   2.8T  2.1T  698G   75%  /sdc
>>     /dev/sdd   2.8T  2.2T  589G   79%  /sdd
>
> I know this sounds silly, but just to be sure, is /dev/sda actually
> mounted on "gfs-node01:/sda"?
> If yes, the files that _were_ successfully rebalanced should have been
> moved from gfs-node01:/sda to one of the other bricks. Is that the case?
>
>> When I mount the system it is read-only (another problem I want to fix
>
> Again, the mount logs could shed some light ..
> (btw a successful rebalance start/status sequence should be followed by
> the rebalance 'commit' command to ensure the volume information gets
> updated)
>
>> ASAP) so I'm pretty sure the failures aren't due to users changing the
>> system underneath me.
>>
>> Thanks for any pointers.
>>
>> James Bellinger
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
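Since the safe sequence is remove-brick start → status → commit only once the
brick's node reports "completed" with an acceptable failure count, a sanity
check of the status output before committing can be sketched as below. This is
a minimal illustration, not part of gluster: the column layout is assumed to
match the table quoted above, `commit_looks_safe` is a hypothetical helper,
and in real use you would feed it the live command output (e.g. via
subprocess) rather than a pasted copy.

```python
# Minimal sketch: parse `gluster volume remove-brick ... status` output
# and decide whether issuing `commit` looks safe. The sample text below
# is the status table from the thread; the parsing assumes that layout.

STATUS_OUTPUT = """\
Node        Rebalanced-files     size     scanned   failures   status
----------  ----------------  -------  ----------  ---------  -----------
localhost                  0   0Bytes           0          0  not started
gfs-node06                 0   0Bytes           0          0  not started
gfs-node03                 0   0Bytes           0          0  not started
gfs-node05                 0   0Bytes           0          0  not started
gfs-node01        2257394624    2.8TB     5161640     208878  completed
"""

def commit_looks_safe(status_text, max_failures=0):
    """Return (safe, reasons) for a remove-brick status table."""
    reasons = []
    for line in status_text.splitlines()[2:]:  # skip header and rule line
        parts = line.split()
        if len(parts) < 6:
            continue
        node = parts[0]
        failures = int(parts[4])
        status = " ".join(parts[5:])
        # Only the node hosting the brick migrates data; the other
        # nodes legitimately report "not started".
        if status == "completed" and failures > max_failures:
            reasons.append(f"{node}: {failures} failures")
        elif status not in ("completed", "not started"):
            reasons.append(f"{node}: status is '{status}'")
    return (not reasons, reasons)

safe, why = commit_looks_safe(STATUS_OUTPUT)
print(safe, why)  # the 208878 failures on gfs-node01 flag this as unsafe
```

The point of the sketch is the decision rule, not the parsing: with 208878
failures reported, a commit would finalize the volume layout while a large
number of files never migrated off the brick, which is exactly the
data-loss concern raised above.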