Fwd: Feature: Rebalance completion time estimation

On 14 November 2016 at 05:10, Shyam <srangana@xxxxxxxxxx> wrote:
On 11/11/2016 05:46 AM, Susant Palai wrote:
Hello All,
   We have received many requests from users for a "rebalance completion time estimation". This email is to gather ideas and feedback from the community on how to provide one. We have one proposal, but nothing is concrete yet. Please feel free to share your input on this problem.

A brief overview of the rebalance operation:
- The rebalance process redistributes data across the cluster, most commonly after an add-brick or remove-brick. A rebalance process is spawned on each node. Its job is to read each directory and fix its layout to include the newly added brick, then read the directory's child files (only those that reside on local bricks) and migrate them if the new layout requires it.


Here is one of the solutions, pitched by Manoj Pillai.

Assumptions for this idea:
 - files are of similar size.
 - Max 40% of the total files will be migrated

1- Do a statfs on the local bricks. Say the total size is St.

Why not use f_files from statfs, which gives the inode count, possibly together with f_ffree, to determine how many inodes are in use? The crawl can then track how many we have visited and how many are pending, which gives the rebalance progress.

I am not sure if the local FS (XFS, say) fills in this data, but if it does, then it may provide a better estimate.
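
A minimal sketch of this suggestion, in Python rather than the Gluster C code. It assumes the brick's filesystem reports inode counts through statvfs; the function names used_inodes and crawl_progress are hypothetical and not part of Gluster.

```python
import os


def used_inodes(brick_path):
    """Return the number of inodes in use on the filesystem holding brick_path."""
    st = os.statvfs(brick_path)
    # f_files is the total inode count, f_ffree the number of free inodes.
    # Some filesystems report 0 for both, in which case this returns 0.
    return st.f_files - st.f_ffree


def crawl_progress(visited, total):
    """Return the fraction of entries visited so far, clamped to [0.0, 1.0]."""
    if total <= 0:
        # No usable inode count, so no progress can be reported.
        return 0.0
    return min(visited / total, 1.0)
```

Here total would be the value returned by used_inodes() when the rebalance starts, and visited would be the running count kept by the crawl.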



Thanks Shyam, that is a good idea.
I tried out a very rough version of this. The statfs call does return the inode info (available and used) on my XFS brick. However, those numbers are thrown way off by the entries in the .glusterfs directory. In my very limited, file-only dataset, there were almost twice as many inodes in use as there were files in the volume. I have yet to try this with a directory-heavy dataset.


High level algorithm:

1. When rebalance starts up, get the estimated number of files on the brick using the statfs inode count.
2. As rebalance proceeds, calculate the rate at which files are being looked up. This is based on the assumption that a rebalance cannot complete until the filesystem crawl is complete. Actual file migration operations do not seem to contribute greatly to this time, but that still needs to be validated with more realistic data sets.
3. Using the calculated rate and the estimated number of files, calculate the time it would take to process all the files on the brick.  That would be our estimate for how long rebalance would take to complete on that node.
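
The three steps above can be sketched roughly as follows. This is an illustration in Python under the stated assumptions, not the actual rebalance code; the function name and parameters are hypothetical.

```python
def estimate_remaining_seconds(estimated_total, processed, elapsed_seconds):
    """Estimate the seconds left for the crawl on one node.

    estimated_total: file count estimated from the statfs inode count (step 1)
    processed:       files looked up so far by the crawl
    elapsed_seconds: time since rebalance started on this node
    """
    if processed <= 0 or elapsed_seconds <= 0:
        # No rate can be computed yet.
        return None
    rate = processed / elapsed_seconds              # step 2: files per second
    remaining = max(estimated_total - processed, 0)
    return remaining / rate                         # step 3: time for the rest
```

Because the rate is recomputed from the latest counters on every call, the estimate corrects itself as the run proceeds.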

Things to be considered/assumptions:

1. Each filesystem partition must contain only a single brick for the statfs info to be valid.
2. My test was run on a single-brick volume to which I added another brick and started rebalance. More nodes and bricks in the cluster would mean that the total number of files might change more frequently, as files are not just moved off the brick but onto it as well.

That being said, the initial results are encouraging. The estimates generated were fairly close to the times actually taken. The estimates are generated every time the "gluster v rebalance <vol> status" command is run, and the values autocorrect to take the latest data into consideration. However, mine was a limited setup and most rebalance runs took around 10 minutes or so. It would be interesting to see the numbers for larger data sets where rebalance takes days or weeks.

Regards,
Nithya

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
