gluster rebalance taking three months

Is there any reason you didn't suggest upgrading to 3.2.6 instead?

On Fri, Oct 21, 2011 at 12:12 AM, Amar Tumballi <amar at gluster.com> wrote:

> Thanks for the logs.
>
> This is due to another issue, 'gfid mismatches', which causes a locking
> deadlock in replicate. The deadlock would normally make each fix-layout
> request take 30 minutes (the default frame timeout) before bailing out;
> in your case I see the frame timeout is set to the very low value of 30
> seconds, which is why you see the message so many times within 5 minutes.
> That explains the slowness of the whole operation.
>
> Please plan to upgrade to version 3.2.4, which has most of the fixes
> related to gfid mismatch issues.
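>
> For reference, the frame timeout can be put back to its 1800-second
> default with the following (a sketch, using your volume name 'dfs'; note
> this only changes how long a frame waits before bailing out, it does not
> fix the underlying gfid mismatch):
>
> # gluster volume set dfs network.frame-timeout 1800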
>
> Regards,
> Amar
>
>


>
> On Fri, Oct 21, 2011 at 12:23 PM, Changliang Chen <hqucocl at gmail.com> wrote:
>
>> Any help?
>>
>> We have noticed that when the errors below appear, the rebalance layout
>> fix becomes very slow; the counter increases by only about 4 every five
>> minutes.
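>>
>> To put that rate in perspective: 4 directories per 5 minutes is about
>> 1,150 per day, so at this pace the 223,623 layouts fixed so far would by
>> themselves represent roughly 190 days of work.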
>>
>> E [rpc-clnt.c:199:call_bail] dfs-client-0: bailing out frame
>> type(GlusterFS 3.1) op(INODELK(29)) xid = 0x755696 sent = 2011-10-20
>> 06:20:51.217782. timeout = 30
>>
>>  W [afr-self-heal-common.c:584:afr_sh_pending_to_delta]
>> afr_sh_pending_to_delta: Unable to get dict value.
>>
>> I [dht-common.c:369:dht_revalidate_cbk] dfs-dht: subvolume
>> 19loudfs-replicate-2 returned -1 (Invalid argument)
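>>
>> Is comparing the trusted.gfid xattr on the backend bricks the right way
>> to confirm a gfid mismatch? For example, on each replica brick of the
>> affected subvolume (the path below is just a placeholder):
>>
>> # getfattr -n trusted.gfid -e hex /data0/path/to/some/file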
>>
>>
>> On Tue, Oct 18, 2011 at 5:45 PM, Changliang Chen <hqucocl at gmail.com> wrote:
>>
>>> Thanks Amar, but it looks like v3.1.1 doesn't support the command
>>>
>>> 'gluster volume rebalance dfs migrate-data start'
>>>
>>> # gluster volume rebalance dfs migrate-data start
>>> Usage: volume rebalance <VOLNAME> <start|stop|status>
>>> Rebalance of Volume dfs failed
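>>>
>>> So for now it seems we can only use the plain form from the usage
>>> string above, i.e. start the rebalance and poll its progress:
>>>
>>> # gluster volume rebalance dfs start
>>> # gluster volume rebalance dfs status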
>>>
>>> On Tue, Oct 18, 2011 at 3:33 PM, Amar Tumballi <amar at gluster.com> wrote:
>>>
>>>> Hi Chen,
>>>>
>>>> Can you restart 'glusterd', run 'gluster volume rebalance dfs
>>>> migrate-data start', and check whether your data migration happens?
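>>>>
>>>> For example (assuming glusterd runs under an init script; adjust the
>>>> restart step to however it is managed on your systems):
>>>>
>>>> # /etc/init.d/glusterd restart
>>>> # gluster volume rebalance dfs migrate-data start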
>>>>
>>>> Regards,
>>>> Amar
>>>>
>>>> On Tue, Oct 18, 2011 at 12:54 PM, Changliang Chen <hqucocl at gmail.com> wrote:
>>>>
>>>>>  Hi guys,
>>>>>
>>>>> we have had a rebalance running on eight bricks since July, and this
>>>>> is what the status looks like right now:
>>>>>
>>>>> ===Tue Oct 18 13:45:01 CST 2011 ====
>>>>> rebalance step 1: layout fix in progress: fixed layout 223623
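>>>>>
>>>>> (That status line comes from a periodic check along the lines of
>>>>> '# date; gluster volume rebalance dfs status'.)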
>>>>>
>>>>> There are roughly 8 TB of photos in the storage, so how long should
>>>>> this rebalance take?
>>>>>
>>>>> What does the number (in this case 223623) represent?
>>>>>
>>>>> Our gluster information:
>>>>> Repository revision: v3.1.1
>>>>> Volume Name: dfs
>>>>> Type: Distributed-Replicate
>>>>> Status: Started
>>>>> Number of Bricks: 4 x 2 = 8
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: 10.1.1.23:/data0
>>>>> Brick2: 10.1.1.24:/data0
>>>>> Brick3: 10.1.1.25:/data0
>>>>> Brick4: 10.1.1.26:/data0
>>>>> Brick5: 10.1.1.27:/data0
>>>>> Brick6: 10.1.1.28:/data0
>>>>> Brick7: 10.1.1.64:/data0
>>>>> Brick8: 10.1.1.65:/data0
>>>>> Options Reconfigured:
>>>>> cluster.min-free-disk: 10%
>>>>> network.ping-timeout: 25
>>>>> network.frame-timeout: 30
>>>>> performance.cache-max-file-size: 512KB
>>>>> performance.cache-size: 3GB
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Regards,
>>>>>
>>>>> Cocl
>>>>> OM manager
>>>>> 19lou Operation & Maintenance Dept
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Regards,
>>>
>>> Cocl
>>> OM manager
>>> 19lou Operation & Maintenance Dept
>>>
>>
>>
>>
>> --
>>
>> Regards,
>>
>> Cocl
>> OM manager
>> 19lou Operation & Maintenance Dept
>>
>>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>
>

