Re: Rebalancing newly added bricks

On Sat, 7 Sep 2019 at 00:03, Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote:
As was mentioned, you might have to run rebalance on the other node - but it is better to wait until this node is done.


Hi Strahil,

Rebalance does not need to be run on the other node - the operation is a volume-wide one. In the version used here, only a single node per replica set migrates files.
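
You can confirm this from either node - the rebalance status command reports on the whole volume, and the volume info output shows how the bricks are grouped into replica sets. For example (volume name taken from the output quoted below):

# gluster volume rebalance tank status
# gluster volume info tank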

Regards,
Nithya

Best Regards,
Strahil Nikolov

On Friday, 6 September 2019, 15:29:20 GMT+3, Herb Burnswell <herbert.burnswell@xxxxxxxxx> wrote:




On Thu, Sep 5, 2019 at 9:56 PM Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:


On Thu, 5 Sep 2019 at 02:41, Herb Burnswell <herbert.burnswell@xxxxxxxxx> wrote:
Thanks for the replies.  The rebalance is running and the brick percentages are not adjusting as expected:

# df -hP |grep data
/dev/mapper/gluster_vg-gluster_lv1_data   60T   49T   11T  83% /gluster_bricks/data1
/dev/mapper/gluster_vg-gluster_lv2_data   60T   49T   11T  83% /gluster_bricks/data2
/dev/mapper/gluster_vg-gluster_lv3_data   60T  4.6T   55T   8% /gluster_bricks/data3
/dev/mapper/gluster_vg-gluster_lv4_data   60T  4.6T   55T   8% /gluster_bricks/data4
/dev/mapper/gluster_vg-gluster_lv5_data   60T  4.6T   55T   8% /gluster_bricks/data5
/dev/mapper/gluster_vg-gluster_lv6_data   60T  4.6T   55T   8% /gluster_bricks/data6

At the current pace it looks like this will continue to run for another 5-6 days.

I appreciate the guidance..


What is the output of the rebalance status command?
Can you check if there are any errors in the rebalance logs on the node on which you see rebalance activity?
If there are a lot of small files on the volume, the rebalance is expected to take time.
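
The rebalance log is normally /var/log/glusterfs/<volname>-rebalance.log on each node, so something along these lines should surface any errors (the path assumes the default log location):

# grep ' E ' /var/log/glusterfs/tank-rebalance.log | tail -20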

Regards,
Nithya

My apologies, that was a typo.  I meant to say:

"The rebalance is running and the brick percentages are NOW adjusting as expected"

I did expect the rebalance to take several days.  The rebalance log is not showing any errors.  Status output:

# gluster vol rebalance tank status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost          1251320        35.5TB       2079527             0             0          in progress      139:9:46
                               serverB                  0        0Bytes             7             0             0            completed       63:47:55
volume rebalance: tank: success

Thanks again for the guidance.

HB

 


On Mon, Sep 2, 2019 at 9:08 PM Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:


On Sat, 31 Aug 2019 at 22:59, Herb Burnswell <herbert.burnswell@xxxxxxxxx> wrote:
Thank you for the reply.

I started a rebalance with force on serverA as suggested.  Now I see 'activity' on that node:

# gluster vol rebalance tank status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             6143         6.1GB          9542             0             0          in progress        0:4:5
                               serverB                  0        0Bytes             7             0             0          in progress        0:4:5
volume rebalance: tank: success

But I am not seeing any activity on serverB.  Is this expected?  Does the rebalance need to be run on each node even though it says both nodes are 'in progress'?


It looks like this is a replicate volume. If that is the case then yes, this is expected - you are running an old version of Gluster in which this was the default behaviour.
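
If you want to double-check the version running on each node (commands assume the RPM-based install you mentioned):

# gluster --version
# rpm -q glusterfs-server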

Regards,
Nithya

Thanks,

HB

On Sat, Aug 31, 2019 at 4:18 AM Strahil <hunter86_bg@xxxxxxxxx> wrote:

The rebalance status shows 0 Bytes.

Maybe you should try 'gluster volume rebalance <VOLNAME> start force'?
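
For your volume that would be (volume name taken from your status output):

# gluster volume rebalance tank start force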

Best Regards,
Strahil Nikolov

Source: https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#rebalancing-volumes

On Aug 30, 2019 20:04, Herb Burnswell <herbert.burnswell@xxxxxxxxx> wrote:
All,

RHEL 7.5
Gluster 3.8.15
2 Nodes: serverA & serverB

I am not deeply knowledgeable about Gluster and its administration, but we have a 2 node cluster that has been running for about a year and a half.  All has worked fine to date.  Our main volume has consisted of two 60TB bricks on each of the cluster nodes.  As we reached capacity on the volume we needed to expand, so we have added four new 60TB bricks to each of the cluster nodes.  The bricks are now seen, and the total size of the volume is as expected:

# gluster vol status tank
Status of volume: tank
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick serverA:/gluster_bricks/data1       49162     0          Y       20318
Brick serverB:/gluster_bricks/data1       49166     0          Y       3432
Brick serverA:/gluster_bricks/data2       49163     0          Y       20323
Brick serverB:/gluster_bricks/data2       49167     0          Y       3435
Brick serverA:/gluster_bricks/data3       49164     0          Y       4625
Brick serverA:/gluster_bricks/data4       49165     0          Y       4644
Brick serverA:/gluster_bricks/data5       49166     0          Y       5088
Brick serverA:/gluster_bricks/data6       49167     0          Y       5128
Brick serverB:/gluster_bricks/data3       49168     0          Y       22314
Brick serverB:/gluster_bricks/data4       49169     0          Y       22345
Brick serverB:/gluster_bricks/data5       49170     0          Y       22889
Brick serverB:/gluster_bricks/data6       49171     0          Y       22932
Self-heal Daemon on localhost             N/A       N/A        Y       22981
Self-heal Daemon on serverA.example.com   N/A       N/A        Y       6202 

After adding the bricks we ran a rebalance from serverA as:

# gluster volume rebalance tank start

The rebalance completed:

# gluster volume rebalance tank status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0            completed        3:7:10
                             serverA.example.com        0        0Bytes             0             0             0            completed        0:0:0
volume rebalance: tank: success

However, when I run a df, the two original bricks still show all of the consumed space (this is the same on both nodes):

# df -hP
Filesystem                               Size  Used Avail Use% Mounted on
/dev/mapper/vg0-root                     5.0G  625M  4.4G  13% /
devtmpfs                                  32G     0   32G   0% /dev
tmpfs                                     32G     0   32G   0% /dev/shm
tmpfs                                     32G   67M   32G   1% /run
tmpfs                                     32G     0   32G   0% /sys/fs/cgroup
/dev/mapper/vg0-usr                       20G  3.6G   17G  18% /usr
/dev/md126                              1014M  228M  787M  23% /boot
/dev/mapper/vg0-home                     5.0G   37M  5.0G   1% /home
/dev/mapper/vg0-opt                      5.0G   37M  5.0G   1% /opt
/dev/mapper/vg0-tmp                      5.0G   33M  5.0G   1% /tmp
/dev/mapper/vg0-var                       20G  2.6G   18G  13% /var
/dev/mapper/gluster_vg-gluster_lv1_data   60T   59T  1.1T  99% /gluster_bricks/data1
/dev/mapper/gluster_vg-gluster_lv2_data   60T   58T  1.3T  98% /gluster_bricks/data2
/dev/mapper/gluster_vg-gluster_lv3_data   60T  451M   60T   1% /gluster_bricks/data3
/dev/mapper/gluster_vg-gluster_lv4_data   60T  451M   60T   1% /gluster_bricks/data4
/dev/mapper/gluster_vg-gluster_lv5_data   60T  451M   60T   1% /gluster_bricks/data5
/dev/mapper/gluster_vg-gluster_lv6_data   60T  451M   60T   1% /gluster_bricks/data6
localhost:/tank                          355T  116T  239T  33% /mnt/tank

We were thinking that the used space would be distributed across all six bricks after the rebalance.  Is that not what a rebalance does?  Is this expected behavior?

Can anyone provide some guidance as to what the behavior here indicates and whether there is anything we need to do at this point?

Thanks in advance,

HB
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
