Thanks Amar. We will try upgrading to 3.2.4 and then rebalance again.

On Fri, Oct 21, 2011 at 3:12 PM, Amar Tumballi <amar at gluster.com> wrote:

> Thanks for the logs.
>
> This is due to another issue, 'gfid mismatches', which causes a locking
> deadlock in replicate and makes each fix-layout request take 30 minutes
> (in your case I see the frame timeout is set to a very low value of 30,
> which is why you see the bail-out message many times in 5 minutes). That
> explains the slowness of the whole operation.
>
> Please plan to upgrade to version 3.2.4, which has most of the fixes
> related to gfid mismatch issues.
>
> Regards,
> Amar
>
>
> On Fri, Oct 21, 2011 at 12:23 PM, Changliang Chen <hqucocl at gmail.com> wrote:
>
>> Any help?
>>
>> We notice that when the errors below appear, the rebalance fix-layout
>> becomes very slow; the counter increases by only about 4 every five
>> minutes.
>>
>> E [rpc-clnt.c:199:call_bail] dfs-client-0: bailing out frame
>> type(GlusterFS 3.1) op(INODELK(29)) xid = 0x755696 sent = 2011-10-20
>> 06:20:51.217782. timeout = 30
>>
>> W [afr-self-heal-common.c:584:afr_sh_pending_to_delta]
>> afr_sh_pending_to_delta: Unable to get dict value.
>>
>> I [dht-common.c:369:dht_revalidate_cbk] dfs-dht: subvolume
>> 19loudfs-replicate-2 returned -1 (Invalid argument)
>>
>>
>> On Tue, Oct 18, 2011 at 5:45 PM, Changliang Chen <hqucocl at gmail.com> wrote:
>>
>>> Thanks Amar, but it looks like v3.1.1 does not support the command
>>> 'gluster volume rebalance dfs migrate-data start':
>>>
>>> # gluster volume rebalance dfs migrate-data start
>>> Usage: volume rebalance <VOLNAME> <start|stop|status>
>>> Rebalance of Volume dfs failed
>>>
>>> On Tue, Oct 18, 2011 at 3:33 PM, Amar Tumballi <amar at gluster.com> wrote:
>>>
>>>> Hi Chen,
>>>>
>>>> Can you restart 'glusterd', run 'gluster volume rebalance dfs
>>>> migrate-data start', and check whether your data migration happens?
>>>>
>>>> Regards,
>>>> Amar
>>>>
>>>> On Tue, Oct 18, 2011 at 12:54 PM, Changliang Chen <hqucocl at gmail.com> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> We have had a rebalance running on eight bricks since July, and this
>>>>> is what the status looks like right now:
>>>>>
>>>>> ===Tue Oct 18 13:45:01 CST 2011 ====
>>>>> rebalance step 1: layout fix in progress: fixed layout 223623
>>>>>
>>>>> There are roughly 8 TB of photos in the storage, so how long should
>>>>> this rebalance take?
>>>>>
>>>>> What does the number (in this case 223623) represent?
>>>>>
>>>>> Our gluster information:
>>>>> Repository revision: v3.1.1
>>>>> Volume Name: dfs
>>>>> Type: Distributed-Replicate
>>>>> Status: Started
>>>>> Number of Bricks: 4 x 2 = 8
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: 10.1.1.23:/data0
>>>>> Brick2: 10.1.1.24:/data0
>>>>> Brick3: 10.1.1.25:/data0
>>>>> Brick4: 10.1.1.26:/data0
>>>>> Brick5: 10.1.1.27:/data0
>>>>> Brick6: 10.1.1.28:/data0
>>>>> Brick7: 10.1.1.64:/data0
>>>>> Brick8: 10.1.1.65:/data0
>>>>> Options Reconfigured:
>>>>> cluster.min-free-disk: 10%
>>>>> network.ping-timeout: 25
>>>>> network.frame-timeout: 30
>>>>> performance.cache-max-file-size: 512KB
>>>>> performance.cache-size: 3GB
>>>>>

--

Regards,

Cocl
OM manager
19lou Operation & Maintenance Dept

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
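
For reference, a minimal sketch of the follow-up commands implied by Amar's diagnosis above. Assumptions not stated in the thread: the volume keeps the name dfs, the <start|stop|status> rebalance syntax shown in the Usage line earlier applies after the upgrade, and 1800 seconds (the GlusterFS default) is used for the frame timeout; that value is illustrative, not a recommendation from the thread.

Raise the frame timeout so lock requests wait instead of bailing out after 30 seconds:

  # gluster volume set dfs network.frame-timeout 1800

Restart the rebalance and watch the fixed-layout counter advance:

  # gluster volume rebalance dfs start
  # gluster volume rebalance dfs status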