Re: how critical is rebalance?

Strahil Nikolov <hunter86_bg@xxxxxxxxx> · Sat, 20 Nov 2021 15:45:12 +0000 (UTC)

Hi Roman,

Gluster doesn't work like that.
Each entry (file, dir, symlink, etc.) in the volume obtains a gfid, and a hash of its name determines the brick it belongs on.

Both cluster nodes and clients use that same hash to decide where the entry should exist (on which brick). If it's not there, the node has to check all bricks in the volume to find it, which leads to sub-optimal performance.

You also risk the first brick filling up while the second one still has free space.
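
To make the lookup cost concrete, here is a toy model of the placement idea described above. It is only a sketch: the CRC32-modulo hash, the brick names and the helper functions are all made up for illustration and are not Gluster's real algorithm or API.

```python
import zlib

def expected_brick(name, bricks):
    # Stand-in hash (CRC32 modulo brick count); NOT Gluster's real algorithm.
    return bricks[zlib.crc32(name.encode("utf-8")) % len(bricks)]

def find(name, bricks, contents):
    """Return (brick holding the entry, whether every brick had to be checked)."""
    guess = expected_brick(name, bricks)
    if name in contents[guess]:
        return guess, False
    # Miss on the expected brick: fall back to checking all bricks.
    for brick in bricks:
        if name in contents[brick]:
            return brick, True
    return None, True

# Hypothetical file names, all written while only brick1 existed.
names = ["report-%d.csv" % i for i in range(8)]
contents = {"brick1": set(names), "brick2": set()}
# brick2 has been added, but nothing has been rebalanced yet.
bricks = ["brick1", "brick2"]

for name in names:
    brick, full_scan = find(name, bricks, contents)
    print(name, "found on", brick, "(checked all bricks)" if full_scan else "(direct hit)")
```

In this model every file whose name now hashes to brick2 is still sitting on brick1, so it is only found after the fallback scan. Moving such files to where the hash expects them is what the rebalance does.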

To speed up the rebalance (provided you have enough CPU power for that), you can set the throttling to aggressive:

gluster volume set dw-data rebal-throttle aggressive

The 3 options are "lazy", "normal" and "aggressive", and they change the number of files migrated at a time as follows:

lazy -> 1 file at a time
normal (this is the default) -> 2 files at a time or (Number of Logical CPUs - 4)/2, whichever is bigger
aggressive -> 4 files at a time or (Number of Logical CPUs - 4)/2, whichever is bigger

If your system has a lot of logical CPUs (12 or more, where (Number of Logical CPUs - 4)/2 is already at least 4), there is no benefit to switching to aggressive, as you are already using the maximum that is allowed.
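
As a rough illustration of the rule above, the numbers can be worked out like this. This is a sketch and not taken from the Gluster source: the function name is made up, and rounding the division down to a whole number is an assumption.

```python
def max_parallel_migrations(throttle, logical_cpus):
    """Files migrated at a time on one node, per the rule quoted above."""
    if throttle == "lazy":
        return 1
    # Assumption: the division rounds down to a whole number of files.
    cpu_based = (logical_cpus - 4) // 2
    minimum = {"normal": 2, "aggressive": 4}[throttle]
    return max(minimum, cpu_based)

for cpus in (4, 8, 12, 32):
    counts = [max_parallel_migrations(t, cpus) for t in ("lazy", "normal", "aggressive")]
    print(cpus, "logical CPUs ->", counts)
```

With 8 logical CPUs, aggressive doubles the parallelism compared to normal (4 instead of 2). From 12 logical CPUs upward both settings give the same number.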

Best Regards,
Strahil Nikolov

On Saturday, 20 November 2021, 15:36:57 GMT+2, Roman <romeo.r@xxxxxxxxx> wrote:

Hello,

I've added a new gluster node to my initial 500TB node. That added another 160TB, and I started the rebalance. It shows this:

 gluster volume rebalance dw-data status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                             dw-gluster1                0        0Bytes            48             0             0          in progress       20:50:31
                               localhost           701180         3.2TB       1170151             0             0          in progress       20:50:31
Estimated time left for rebalance to complete :    59679:10:10
volume rebalance: dw-data: success

which seems too long to complete. There are also lots of small files. So how critical is this rebalance function? What will happen if I stop it and never complete it? What is the point of the rebalance? I don't mind if the second storage server is used only once the first one is full.
-- 
Best regards,
Roman.

________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users