remove-brick removed unexpected bricks

Thanks Ravi, I managed to reproduce the issue twice in the past several 
days, but found nothing significant in the logs. 'gluster volume info' 
shows correct information before and after (i.e. sdd1 got removed even 
though its data was not migrated out), while rebalance.log says it was 
migrating data out of sdc1, not sdd1.

I'm trying again now with -L TRACE to see if I can get more log 
information. This will take some time; I will post here if I find 
anything helpful.
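
For the before/after comparison of 'gluster volume info', something like 
the following could be used. It is only a minimal sketch: the helper names 
bricks_of and removed_bricks and the file names before.txt/after.txt are 
made up for illustration, and it assumes the usual "BrickN: host:/path" 
lines in the saved output.

```shell
#!/bin/sh
# Sketch: compare brick lists from two saved `gluster volume info` dumps.
# Assumption: bricks appear as lines of the form "BrickN: host:/path".

# Print the sorted brick list found in a saved volume-info file.
bricks_of() {
    grep -E '^Brick[0-9]+:' "$1" | awk '{print $2}' | sort
}

# Print bricks present in the "before" file but missing from the "after" file.
removed_bricks() {
    before_list=$(mktemp)
    after_list=$(mktemp)
    bricks_of "$1" > "$before_list"
    bricks_of "$2" > "$after_list"
    comm -23 "$before_list" "$after_list"
    rm -f "$before_list" "$after_list"
}

# Intended use around a remove-brick run (not executed by this script):
#   gluster volume info gfs_v0 > before.txt
#   ... remove-brick start / status / commit ...
#   gluster volume info gfs_v0 > after.txt
#   removed_bricks before.txt after.txt
```

If the bricks printed differ from the ones named in the remove-brick 
command, the volume definition itself changed unexpectedly; if they match, 
the problem is more likely in which brick the data was migrated from.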

-C.B.
On 8/13/2013 6:49 AM, Ravishankar N wrote:
> On 08/13/2013 06:21 PM, Cool wrote:
>> I'm pretty sure I ran "watch ... remove-brick ... status" until it 
>> reported everything as completed before triggering commit; I should 
>> have made that clear in my previous mail.
>>
>> Actually you can read my mail again - in step #5, files on /sdc1 got 
>> migrated instead of /sdd1, even though my command was trying to 
>> remove-brick /sdd1, 
> Ah, my bad. Got it now. This is strange..
>> this is (to me) the root cause of the problem: data on /sdc1 
>> migrated to /sdb1 and /sdd1, then commit simply removed /sdd1 from 
>> gfs_v0. It seems something is wrong with the volume definition 
>> information in gluster.
> If you are able to reproduce the issue, does 'gluster volume info' 
> show the correct bricks before and after start-status-commit 
> operations of removing sdd1? You could also see if there are any error 
> messages in /var/log/glusterfs/<volname>-rebalance.log
>
> -Ravi
>>
>> -C.B.
>>
>> On 8/12/2013 9:51 PM, Ravishankar N wrote:
>>> On 08/13/2013 03:43 AM, Cool wrote:
>>>> remove-brick in 3.4.0 seems to be removing the wrong bricks, can 
>>>> someone help review the environment/steps to see if I did anything 
>>>> stupid?
>>>>
>>>> setup - Ubuntu 12.04LTS on gfs11 and gfs12, with following packages 
>>>> from ppa, both nodes have 3 xfs partitions sdb1, sdc1, sdd1:
>>>> ii  glusterfs-client 3.4.0final-ubuntu1~precise1 clustered file-system (client package)
>>>> ii  glusterfs-common 3.4.0final-ubuntu1~precise1 GlusterFS common libraries and translator modules
>>>> ii  glusterfs-server 3.4.0final-ubuntu1~precise1 clustered file-system (server package)
>>>>
>>>> steps to reproduce the problem:
>>>> 1. create volume gfs_v0 in replica 2 with gfs11:/sdb1 and gfs12:/sdb1
>>>> 2. add-brick gfs11:/sdc1 and gfs12:/sdc1
>>>> 3. add-brick gfs11:/sdd1 and gfs12:/sdd1
>>>> 4. rebalance to distribute files across all three pairs of disks
>>>> 5. remove-brick gfs11:/sdd1 and gfs12:/sdd1 start, files on 
>>>> ***/sdc1*** are migrating out
>>>> 6. remove-brick commit led to data loss in gfs_v0
>>>>
>>>> If between step 5 and 6 I initiate a remove-brick targeting /sdc1, 
>>>> then after commit I would not lose anything since all data will be 
>>>> migrated back to /sdb1.
>>>>
>>>
>>> You should ensure that a 'remove-brick start' has completed and 
>>> then commit it before initiating the second one. The correct way to 
>>> do this would be:
>>> 5.   # gluster volume remove-brick gfs_v0 gfs11:/sdd1 gfs12:/sdd1 start
>>> 6. Check that the data migration has been completed using the status 
>>> command:
>>>       # gluster volume remove-brick gfs_v0 gfs11:/sdd1 gfs12:/sdd1 status
>>> 7.   # gluster volume remove-brick gfs_v0 gfs11:/sdd1 gfs12:/sdd1 commit
>>> 8.   # gluster volume remove-brick gfs_v0 gfs11:/sdc1 gfs12:/sdc1 start
>>> 9.   # gluster volume remove-brick gfs_v0 gfs11:/sdc1 gfs12:/sdc1 status
>>> 10.  # gluster volume remove-brick gfs_v0 gfs11:/sdc1 gfs12:/sdc1 commit
>>>
>>> This would leave you with the original replica 2 volume that you had 
>>> begun with. Hope this helps.
>>>
>>> Note:
>>> The latest version of glusterfs has a check that prevents a second 
>>> remove-brick operation until the first one has been committed.
>>> (You would receive a message like this: "volume remove-brick start: 
>>> failed: An earlier remove-brick task exists for volume <volname>. 
>>> Either commit it or stop it before starting a new task." )
>>>
>>> -Ravi
>>>
>>>
>>>> -C.B.
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>
>
>
>


