Thanks Amar. We will try upgrading to 3.2.4 and then rebalance again.

On Fri, Oct 21, 2011 at 3:12 PM, Amar Tumballi <amar at gluster.com> wrote:

> Thanks for the logs.
>
> This is due to another issue, 'gfid mismatches', which causes a locking
> deadlock in replicate and makes each fix-layout request take 30 minutes
> (in your case I see the frame timeout is set to a very low value of 30,
> which is why you see the bail-out message many times in 5 minutes). That
> explains the slowness of the whole operation.
>
> Please plan to upgrade to version 3.2.4, which has most of the fixes
> related to gfid mismatch issues.
>
> Regards,
> Amar
>
>
> On Fri, Oct 21, 2011 at 12:23 PM, Changliang Chen <hqucocl at gmail.com> wrote:
>
>> Any help?
>>
>> We notice that when the errors below appear, the rebalance fix-layout
>> becomes very slow; the counter increases by only about 4 every five
>> minutes.
>>
>> E [rpc-clnt.c:199:call_bail] dfs-client-0: bailing out frame
>> type(GlusterFS 3.1) op(INODELK(29)) xid = 0x755696 sent = 2011-10-20
>> 06:20:51.217782. timeout = 30
>>
>> W [afr-self-heal-common.c:584:afr_sh_pending_to_delta]
>> afr_sh_pending_to_delta: Unable to get dict value.
>>
>> I [dht-common.c:369:dht_revalidate_cbk] dfs-dht: subvolume
>> 19loudfs-replicate-2 returned -1 (Invalid argument)
>>
>>
>> On Tue, Oct 18, 2011 at 5:45 PM, Changliang Chen <hqucocl at gmail.com> wrote:
>>
>>> Thanks Amar, but it looks like v3.1.1 does not support the command
>>> 'gluster volume rebalance dfs migrate-data start':
>>>
>>> # gluster volume rebalance dfs migrate-data start
>>> Usage: volume rebalance <VOLNAME> <start|stop|status>
>>> Rebalance of Volume dfs failed
>>>
>>> On Tue, Oct 18, 2011 at 3:33 PM, Amar Tumballi <amar at gluster.com> wrote:
>>>
>>>> Hi Chen,
>>>>
>>>> Can you restart 'glusterd', run 'gluster volume rebalance dfs
>>>> migrate-data start', and check whether your data migration happens?
>>>>
>>>> Regards,
>>>> Amar
>>>>
>>>> On Tue, Oct 18, 2011 at 12:54 PM, Changliang Chen <hqucocl at gmail.com> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> We have had a rebalance running on eight bricks since July, and this
>>>>> is what the status looks like right now:
>>>>>
>>>>> ===Tue Oct 18 13:45:01 CST 2011 ====
>>>>> rebalance step 1: layout fix in progress: fixed layout 223623
>>>>>
>>>>> There are roughly 8 TB of photos in the storage, so how long should
>>>>> this rebalance take?
>>>>>
>>>>> What does the number (in this case 223623) represent?
>>>>>
>>>>> Our gluster information:
>>>>> Repository revision: v3.1.1
>>>>> Volume Name: dfs
>>>>> Type: Distributed-Replicate
>>>>> Status: Started
>>>>> Number of Bricks: 4 x 2 = 8
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: 10.1.1.23:/data0
>>>>> Brick2: 10.1.1.24:/data0
>>>>> Brick3: 10.1.1.25:/data0
>>>>> Brick4: 10.1.1.26:/data0
>>>>> Brick5: 10.1.1.27:/data0
>>>>> Brick6: 10.1.1.28:/data0
>>>>> Brick7: 10.1.1.64:/data0
>>>>> Brick8: 10.1.1.65:/data0
>>>>> Options Reconfigured:
>>>>> cluster.min-free-disk: 10%
>>>>> network.ping-timeout: 25
>>>>> network.frame-timeout: 30
>>>>> performance.cache-max-file-size: 512KB
>>>>> performance.cache-size: 3GB
>>>>>

--

Regards,

Cocl
OM manager
19lou Operation & Maintenance Dept

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
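
For reference, a minimal sketch of the follow-up commands implied by Amar's diagnosis above. Assumptions not stated in the thread: the volume keeps the name dfs, the <start|stop|status> rebalance syntax shown in the Usage line earlier applies after the upgrade, and 1800 seconds (the GlusterFS default) is used for the frame timeout; that value is illustrative, not a recommendation from the thread.

Raise the frame timeout so lock requests wait instead of bailing out after 30 seconds:

  # gluster volume set dfs network.frame-timeout 1800

Restart the rebalance and watch the fixed-layout counter advance:

  # gluster volume rebalance dfs start
  # gluster volume rebalance dfs status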