Hi Ben,

I checked the glusterfs process by attaching gdb and I could not find the newer code. Can you confirm whether you took the new patch? The patch is: http://review.gluster.org/#/c/9657/

Thanks,
Susant
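For anyone repeating the check, here is a minimal sketch of one way to do it. The process filter and the idea that the patch's new functions show up under the gf_defrag_ prefix are assumptions, not details confirmed in this thread; adjust both to your volume name and to whatever symbols the patch actually adds.

    # Find the rebalance process for the volume (filter is an assumption; adjust to your setup).
    pid=$(pgrep -f 'glusterfs.*rebalance.*test1' | head -n 1)

    # Attach gdb non-interactively and list the DHT defrag symbols. If the new
    # patch is built in, the extra migration functions it introduces should show
    # up here; on the old build they will be absent.
    gdb -p "$pid" --batch -ex 'info functions gf_defrag_' -ex 'detach'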
> ----- Original Message -----
> From: "Susant Palai" <spalai@xxxxxxxxxx>
> To: "Benjamin Turner" <bennyturns@xxxxxxxxx>, "Nithya Balachandran" <nbalacha@xxxxxxxxxx>
> Cc: "Shyamsundar Ranganathan" <srangana@xxxxxxxxxx>
> Sent: Wednesday, April 29, 2015 1:22:02 PM
> Subject: Re: Rebalance improvement design
>
> This is how it looks for 2000 files, each 1 MB. Rebalance was run on a 2*2 + 2 setup.
>
> OLDER:
> [root@gprfs030 ~]# gluster v rebalance test1 status
>          Node   Rebalanced-files     size   scanned   failures   skipped      status   run time in secs
>     localhost               2000    1.9GB      3325          0         0   completed              63.00
> gprfs032-10ge                  0   0Bytes      2158          0         0   completed               6.00
> volume rebalance: test1: success:
> [root@gprfs030 ~]#
>
> NEW:
> [root@gprfs030 upstream_rebalance]# gluster v rebalance test1 status
>          Node   Rebalanced-files     size   scanned   failures   skipped      status   run time in secs
>     localhost               2000    1.9GB      2011          0         0   completed              12.00
> gprfs032-10ge                  0   0Bytes         0          0         0      failed               0.00
> volume rebalance: test1: success:
>
> [The gprfs032-10ge node failed because of a crash which I will address in the next patch.]
>
> Just trying out replica behaviour for rebalance.
>
> Here is the volume info.
> [root@gprfs030 ~]# gluster v i
>
> Volume Name: test1
> Type: Distributed-Replicate
> Volume ID: e12ef289-86f2-454a-beaa-72ea763dbada
> Status: Started
> Number of Bricks: 3 x 2 = 6
> Transport-type: tcp
> Bricks:
> Brick1: gprfs030-10ge:/bricks/gprfs030/brick1
> Brick2: gprfs032-10ge:/bricks/gprfs032/brick1
> Brick3: gprfs030-10ge:/bricks/gprfs030/brick2
> Brick4: gprfs032-10ge:/bricks/gprfs032/brick2
> Brick5: gprfs030-10ge:/bricks/gprfs030/brick3
> Brick6: gprfs032-10ge:/bricks/gprfs032/brick3
>
> > ----- Original Message -----
> > From: "Susant Palai" <spalai@xxxxxxxxxx>
> > To: "Benjamin Turner" <bennyturns@xxxxxxxxx>
> > Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > Sent: Wednesday, April 29, 2015 1:13:04 PM
> > Subject: Re: Rebalance improvement design
> >
> > Ben, will you be able to give the rebal stat for the same configuration and data set with the older rebalance infra?
> >
> > Thanks,
> > Susant
> >
> > > ----- Original Message -----
> > > From: "Susant Palai" <spalai@xxxxxxxxxx>
> > > To: "Benjamin Turner" <bennyturns@xxxxxxxxx>
> > > Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > > Sent: Wednesday, April 29, 2015 12:08:38 PM
> > > Subject: Re: Rebalance improvement design
> > >
> > > Hi Ben,
> > > Yes, we were using a pure dist volume. I will log in to your systems for more info.
> > >
> > > Can you please update which patch set you used? In the meantime I will do one set of tests with the same configuration on a small data set.
> > >
> > > Thanks,
> > > Susant
> > >
> > > > ----- Original Message -----
> > > > From: "Benjamin Turner" <bennyturns@xxxxxxxxx>
> > > > To: "Nithya Balachandran" <nbalacha@xxxxxxxxxx>
> > > > Cc: "Susant Palai" <spalai@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > > > Sent: Wednesday, April 29, 2015 2:13:05 AM
> > > > Subject: Re: Rebalance improvement design
> > > >
> > > > I am not seeing the performance you were. I am running on 500GB of data:
> > > >
> > > > [root@gqas001 ~]# gluster v rebalance testvol status
> > > >                               Node   Rebalanced-files     size   scanned   failures   skipped        status   run time in secs
> > > >                          localhost             129021    7.9GB    912104          0         0   in progress           10100.00
> > > > gqas012.sbu.lab.eng.bos.redhat.com                  0   0Bytes   1930312          0         0   in progress           10100.00
> > > > gqas003.sbu.lab.eng.bos.redhat.com                  0   0Bytes   1930312          0         0   in progress           10100.00
> > > > gqas004.sbu.lab.eng.bos.redhat.com             128903    7.9GB    946730          0         0   in progress           10100.00
> > > > gqas013.sbu.lab.eng.bos.redhat.com                  0   0Bytes   1930312          0         0   in progress           10100.00
> > > > gqas014.sbu.lab.eng.bos.redhat.com                  0   0Bytes   1930312          0         0   in progress           10100.00
> > > >
> > > > Based on what I am seeing I expect this to take 2 days. Was your rebal run on a pure dist volume? I am trying on 2x2 + 2 new bricks. Any idea why mine is taking so long?
> > > >
> > > > -b
> > > >
> > > > On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
> > > >
> > > > > That sounds great. Thanks.
> > > > >
> > > > > Regards,
> > > > > Nithya
> > > > >
> > > > > ----- Original Message -----
> > > > > From: "Benjamin Turner" <bennyturns@xxxxxxxxx>
> > > > > To: "Nithya Balachandran" <nbalacha@xxxxxxxxxx>
> > > > > Cc: "Susant Palai" <spalai@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > > > > Sent: Wednesday, 22 April, 2015 12:14:14 AM
> > > > > Subject: Re: Rebalance improvement design
> > > > >
> > > > > I am setting up a test env now, I'll have some feedback for you this week.
> > > > >
> > > > > -b
> > > > >
> > > > > On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
> > > > >
> > > > > > Hi Ben,
> > > > > >
> > > > > > Did you get a chance to try this out?
> > > > > >
> > > > > > Regards,
> > > > > > Nithya
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > From: "Susant Palai" <spalai@xxxxxxxxxx>
> > > > > > To: "Benjamin Turner" <bennyturns@xxxxxxxxx>
> > > > > > Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > > > > > Sent: Monday, April 13, 2015 9:55:07 AM
> > > > > > Subject: Re: Rebalance improvement design
> > > > > >
> > > > > > Hi Ben,
> > > > > > Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can start perf tests on it. :)
> > > > > >
> > > > > > Susant
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > From: "Susant Palai" <spalai@xxxxxxxxxx>
> > > > > > To: "Benjamin Turner" <bennyturns@xxxxxxxxx>
> > > > > > Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > > > > > Sent: Thursday, 9 April, 2015 3:40:09 PM
> > > > > > Subject: Re: Rebalance improvement design
> > > > > >
> > > > > > Thanks Ben. The RPM is not available and I am planning to refresh the patch in two days with some more regression fixes. I think we can run the tests post that. Any larger data set will be good (say 3 to 5 TB).
> > > > > >
> > > > > > Thanks,
> > > > > > Susant
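For anyone setting up a similar run, below is a rough sketch of one way to lay down a flat data set of 1 MB files on the client mount before adding bricks and starting rebalance. The mount point and counts are placeholders rather than values from this thread; scale them up to reach the few-TB range suggested above.

    #!/bin/bash
    # Populate a Gluster fuse mount with many 1 MB files for a rebalance test.
    # MOUNT, DIRS and FILES_PER_DIR are placeholders; size them to reach the
    # data-set volume you want.
    MOUNT=/mnt/testvol
    DIRS=100
    FILES_PER_DIR=1000

    for d in $(seq 1 "$DIRS"); do
        mkdir -p "$MOUNT/dir$d"
        for f in $(seq 1 "$FILES_PER_DIR"); do
            # /dev/zero keeps creation fast; switch to /dev/urandom for
            # non-compressible data.
            dd if=/dev/zero of="$MOUNT/dir$d/file$f" bs=1M count=1 status=none
        done
    done

Once the data set is in place, add the new bricks, run "gluster v rebalance <vol> start force", and poll "gluster v rebalance <vol> status" as in the outputs quoted in this thread.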
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > From: "Benjamin Turner" <bennyturns@xxxxxxxxx>
> > > > > > To: "Vijay Bellur" <vbellur@xxxxxxxxxx>
> > > > > > Cc: "Susant Palai" <spalai@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > > > > > Sent: Thursday, 9 April, 2015 2:10:30 AM
> > > > > > Subject: Re: Rebalance improvement design
> > > > > >
> > > > > > I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from:
> > > > > >
> > > > > > git fetch git://review.gluster.org/glusterfs refs/changes/57/9657/8 && git cherry-pick FETCH_HEAD
> > > > > >
> > > > > > I will have _at_least_ 10TB of storage, how many TBs of data should I run with?
> > > > > >
> > > > > > -b
> > > > > >
> > > > > > On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On 04/07/2015 03:08 PM, Susant Palai wrote:
> > > > > >
> > > > > > Here is one test performed on a 300GB data set and around 100% (1/2 the time) improvement was seen.
> > > > > >
> > > > > > [root@gprfs031 ~]# gluster v i
> > > > > >
> > > > > > Volume Name: rbperf
> > > > > > Type: Distribute
> > > > > > Volume ID: 35562662-337e-4923-b862-d0bbb0748003
> > > > > > Status: Started
> > > > > > Number of Bricks: 4
> > > > > > Transport-type: tcp
> > > > > > Bricks:
> > > > > > Brick1: gprfs029-10ge:/bricks/gprfs029/brick1
> > > > > > Brick2: gprfs030-10ge:/bricks/gprfs030/brick1
> > > > > > Brick3: gprfs031-10ge:/bricks/gprfs031/brick1
> > > > > > Brick4: gprfs032-10ge:/bricks/gprfs032/brick1
> > > > > >
> > > > > > Added server 32 and started rebalance force.
> > > > > >
> > > > > > Rebalance stat for new changes:
> > > > > > [root@gprfs031 ~]# gluster v rebalance rbperf status
> > > > > >          Node   Rebalanced-files     size   scanned   failures   skipped      status   run time in secs
> > > > > >     localhost              74639   36.1GB    297319          0         0   completed            1743.00
> > > > > >  172.17.40.30              67512   33.5GB    269187          0         0   completed            1395.00
> > > > > > gprfs029-10ge              79095   38.8GB    284105          0         0   completed            1559.00
> > > > > > gprfs032-10ge                  0   0Bytes         0          0         0   completed             402.00
> > > > > > volume rebalance: rbperf: success:
> > > > > >
> > > > > > Rebalance stat for old model:
> > > > > > [root@gprfs031 ~]# gluster v rebalance rbperf status
> > > > > >          Node   Rebalanced-files     size   scanned   failures   skipped      status   run time in secs
> > > > > >     localhost              86493   42.0GB    634302          0         0   completed            3329.00
> > > > > > gprfs029-10ge              94115   46.2GB    687852          0         0   completed            3328.00
> > > > > > gprfs030-10ge              74314   35.9GB    651943          0         0   completed            3072.00
> > > > > > gprfs032-10ge                  0   0Bytes    594166          0         0   completed            1943.00
> > > > > > volume rebalance: rbperf: success:
> > > > > >
> > > > > > This is interesting. Thanks for sharing & well done!
> > > > > > Maybe we should attempt a much larger data set and see how we fare there :).
> > > > > >
> > > > > > Regards,
> > > > > > Vijay

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel