Re: df does not show full volume capacity after update to 3.12.4


On Thu, Feb 1, 2018 at 9:31 AM, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
Hi,

I think we have a workaround until we have a fix in the code. The following worked on my system.

Copy the attached file to /usr/lib*/glusterfs/<version>/filter/, where <version> matches your installed gluster version (3.12.5 in the listing below). You might need to create the filter directory in this path.
Make sure the file has execute permissions. On my system:

[root@rhgsserver1 fuse2]# cd /usr/lib/glusterfs/3.12.5/
[root@rhgsserver1 3.12.5]# l
total 4.0K
drwxr-xr-x.  2 root root   64 Feb  1 08:56 auth
drwxr-xr-x.  2 root root   34 Feb  1 09:12 filter
drwxr-xr-x.  2 root root   66 Feb  1 08:55 rpc-transport
drwxr-xr-x. 13 root root 4.0K Feb  1 08:57 xlator

[root@rhgsserver1 fuse2]# cd filter
[root@rhgsserver1 filter]# pwd
/usr/lib/glusterfs/3.12.5/filter
[root@rhgsserver1 filter]# ll
total 4
-rwxr-xr-x. 1 root root 95 Feb  1 09:12 shared-brick-count.sh
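The attachment itself isn't reproduced in this archive. A minimal sketch of a filter along these lines, assuming glusterd invokes each executable in the filter directory with the path of a freshly generated volfile as its first argument:

#!/bin/bash
# Hypothetical reconstruction of shared-brick-count.sh: force the
# shared-brick-count option to 1 in the volfile glusterd passes as $1.
sed -i 's/option shared-brick-count [0-9]*/option shared-brick-count 1/g' "$1"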

Rerun the following (setting any volume option regenerates the volfiles, which runs the filter):
gluster v set dataeng cluster.min-free-inodes 6%



Wow! I like this approach :-) Awesome! Thanks Nithya.

-Amar
 
Check the .vol files to see if the value has changed. It should now be 1. You do not need to restart the volume.
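For example, something like this should now show the option as 1 for every brick volfile (path as in the sed one-liner quoted below):

grep 'option shared-brick-count' /var/lib/glusterd/vols/dataeng/*.vol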

See [1] for more details.

Regards,
Nithya





On 31 January 2018 at 23:39, Freer, Eva B. <freereb@xxxxxxxx> wrote:

Amar,

 

Thanks for your prompt reply. No, I do not plan to fix the code and re-compile. I was hoping it could be fixed by setting shared-brick-count or some other option. Since this is a production system, we will wait until a fix is in a release.

 

Thanks,

Eva     (865) 574-6894

 

From: Amar Tumballi <atumball@xxxxxxxxxx>
Date: Wednesday, January 31, 2018 at 12:15 PM
To: Eva Freer <freereb@xxxxxxxx>
Cc: Nithya Balachandran <nbalacha@xxxxxxxxxx>, "Greene, Tami McFarlin" <greenet@xxxxxxxx>, "gluster-users@xxxxxxxxxxx" <gluster-users@xxxxxxxxxxx>


Subject: Re: df does not show full volume capacity after update to 3.12.4

 

Hi Freer,

 

Our analysis is that this issue is caused by https://review.gluster.org/17618, specifically by 'gd_set_shared_brick_count()' in https://review.gluster.org/#/c/17618/9/xlators/mgmt/glusterd/src/glusterd-utils.c.
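For context, shared-brick-count is supposed to record how many bricks share one underlying filesystem, so that each brick's statfs results can be divided accordingly. A quick way to compare device IDs across bricks (using the brick paths from this thread):

# Bricks reporting the same device ID share a filesystem; bricks on
# separate partitions should each end up with shared-brick-count 1.
stat -c '%d  %n' /bricks/data_A*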

 

But even if we fix it today, I don't think we have a release planned immediately for shipping this. Are you planning to fix the code and re-compile?

 

Regards,

Amar

 

On Wed, Jan 31, 2018 at 10:00 PM, Freer, Eva B. <freereb@xxxxxxxx> wrote:

Nithya,

 

I will be out of the office for ~10 days starting tomorrow. Is there any way we could possibly resolve it today?

 

Thanks,

Eva     (865) 574-6894

 

From: Nithya Balachandran <nbalacha@xxxxxxxxxx>
Date: Wednesday, January 31, 2018 at 11:26 AM


To: Eva Freer <freereb@xxxxxxxx>
Cc: "Greene, Tami McFarlin" <greenet@xxxxxxxx>, "gluster-users@xxxxxxxxxxx" <gluster-users@xxxxxxxxxxx>, Amar Tumballi <atumball@xxxxxxxxxx>
Subject: Re: df does not show full volume capacity after update to 3.12.4

 

 

On 31 January 2018 at 21:50, Freer, Eva B. <freereb@xxxxxxxx> wrote:

The values for shared-brick-count are still the same. I did not re-start the volume after setting the cluster.min-free-inodes to 6%. Do I need to restart it?

 

That is not necessary. Let me get back to you on this tomorrow.

 

Regards,

Nithya

 

 

Thanks,

Eva     (865) 574-6894

 

From: Nithya Balachandran <nbalacha@xxxxxxxxxx>
Date: Wednesday, January 31, 2018 at 11:14 AM
To: Eva Freer <freereb@xxxxxxxx>
Cc: "Greene, Tami McFarlin" <greenet@xxxxxxxx>, "gluster-users@xxxxxxxxxxx" <gluster-users@xxxxxxxxxxx>, Amar Tumballi <atumball@xxxxxxxxxx>


Subject: Re: df does not show full volume capacity after update to 3.12.4

 

 

 

On 31 January 2018 at 21:34, Freer, Eva B. <freereb@xxxxxxxx> wrote:

Nithya,

 

Responding to an earlier question: before the upgrade, we were at 3.10.3 on these servers, but some of the clients were 3.7.6. From below, does this mean that “shared-brick-count” needs to be set to 1 for all bricks?

 

All of the bricks are on separate xfs partitions built on hardware RAID 6 volumes. LVM is not used. The current setting for cluster.min-free-inodes was 5%. I changed it to 6% per your instructions below. The df output is still the same, but I haven’t yet run:

find /var/lib/glusterd/vols -type f|xargs sed -i -e 's/option shared-brick-count [0-9]*/option shared-brick-count 1/g'

Should I go ahead and do this?

 

Can you check if the values have been changed in the .vol files before you try this? 

 

These files are regenerated every time the volume is changed, so editing them directly may not be permanent. I was hoping that setting cluster.min-free-inodes would have corrected this automatically and helped us figure out what is happening, as we have not managed to reproduce this issue yet.

 

 

 

 

Output of stat -f for all the bricks:

[root@jacen ~]# stat -f /bricks/data_A*
  File: "/bricks/data_A1"
    ID: 80100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 15626471424 Free: 4530515093 Available: 4530515093
Inodes: Total: 1250159424 Free: 1250028064
  File: "/bricks/data_A2"
    ID: 81100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 15626471424 Free: 3653183901 Available: 3653183901
Inodes: Total: 1250159424 Free: 1250029262
  File: "/bricks/data_A3"
    ID: 82100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 15626471424 Free: 15134840607 Available: 15134840607
Inodes: Total: 1250159424 Free: 1250128031
  File: "/bricks/data_A4"
    ID: 83100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 15626471424 Free: 15626461604 Available: 15626461604
Inodes: Total: 1250159424 Free: 1250153857

[root@jaina dataeng]# stat -f /bricks/data_B*
  File: "/bricks/data_B1"
    ID: 80100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 15626471424 Free: 5689640723 Available: 5689640723
Inodes: Total: 1250159424 Free: 1250047934
  File: "/bricks/data_B2"
    ID: 81100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 15626471424 Free: 6623312785 Available: 6623312785
Inodes: Total: 1250159424 Free: 1250048131
  File: "/bricks/data_B3"
    ID: 82100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 15626471424 Free: 15106888485 Available: 15106888485
Inodes: Total: 1250159424 Free: 1250122139
  File: "/bricks/data_B4"
    ID: 83100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 15626471424 Free: 15626461604 Available: 15626461604
Inodes: Total: 1250159424 Free: 1250153857

 

 

Thanks,

Eva     (865) 574-6894

 

From: Nithya Balachandran <nbalacha@xxxxxxxxxx>
Date: Wednesday, January 31, 2018 at 10:46 AM
To: Eva Freer <freereb@xxxxxxxx>, "Greene, Tami McFarlin" <greenet@xxxxxxxx>
Cc: Amar Tumballi <atumball@xxxxxxxxxx>


Subject: Re: df does not show full volume capacity after update to 3.12.4

 

Thank you Eva.

 

From the files you sent:

dataeng.jacen.bricks-data_A1-dataeng.vol:    option shared-brick-count 2
dataeng.jacen.bricks-data_A2-dataeng.vol:    option shared-brick-count 2
dataeng.jacen.bricks-data_A3-dataeng.vol:    option shared-brick-count 1
dataeng.jacen.bricks-data_A4-dataeng.vol:    option shared-brick-count 1
dataeng.jaina.bricks-data_B1-dataeng.vol:    option shared-brick-count 0
dataeng.jaina.bricks-data_B2-dataeng.vol:    option shared-brick-count 0
dataeng.jaina.bricks-data_B3-dataeng.vol:    option shared-brick-count 0
dataeng.jaina.bricks-data_B4-dataeng.vol:    option shared-brick-count 0

 

 

Are all of these bricks on separate filesystem partitions? If yes, can you please try running the following on one of the gluster nodes and see if the df output is correct after that?

 

gluster v set dataeng cluster.min-free-inodes 6%

 

 

If it doesn't work, please send us the stat -f output for each brick.

 

Regards,

Nithya

 

On 31 January 2018 at 20:41, Freer, Eva B. <freereb@xxxxxxxx> wrote:

Nithya,

 

The file for one of the servers is attached.

 

Thanks,

Eva     (865) 574-6894

 

From: Nithya Balachandran <nbalacha@xxxxxxxxxx>
Date: Wednesday, January 31, 2018 at 1:17 AM
To: Eva Freer <freereb@xxxxxxxx>
Cc: "gluster-users@xxxxxxxxxxx" <gluster-users@xxxxxxxxxxx>, "Greene, Tami McFarlin" <greenet@xxxxxxxx>
Subject: Re: df does not show full volume capacity after update to 3.12.4

 

I found this on the mailing list:

I found the issue.

The CentOS 7 RPMs, upon upgrade, modify the .vol files. Among other things, they add "option shared-brick-count \d", using the number of bricks in the volume.

This makes df report the average free space per brick instead of the total free space in the volume.

When I create a new volume, the value of "shared-brick-count" is "1".

find /var/lib/glusterd/vols -type f|xargs sed -i -e 's/option shared-brick-count [0-9]*/option shared-brick-count 1/g'

 

 

 

Eva, can you send me the contents of the /var/lib/glusterd/vols/<volname> folder from any one node so I can confirm if this is the problem?
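One way to package that up, for example (using the volume name from this thread):

tar czf dataeng-volfiles.tar.gz /var/lib/glusterd/vols/dataeng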

 

Regards,

Nithya

 

 

On 31 January 2018 at 10:47, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:

Hi Eva,

 

One more question. What version of gluster were you running before the upgrade?

 

Thanks,

Nithya

 

On 31 January 2018 at 09:52, Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:

Hi Eva,

 

Can you send us the following:

 

gluster volume info

gluster volume status

 

The log files, and a tcpdump of a df run on a fresh mount point for that volume.
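A capture session could look like the following sketch (server name, mount point, and brick port range are assumptions; glusterd listens on 24007 and bricks typically use ports from 49152 up):

# Mount the volume on a fresh mount point, capture traffic, run df.
mkdir -p /mnt/dataeng-test
mount -t glusterfs server1:/dataeng /mnt/dataeng-test
tcpdump -i any -s 0 -w /tmp/df.pcap 'port 24007 or portrange 49152-49251' &
df -h /mnt/dataeng-test
kill %1    # stop the capture after df returns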

 

Thanks,

Nithya

 

 

On 31 January 2018 at 07:17, Freer, Eva B. <freereb@xxxxxxxx> wrote:

After OS update to CentOS 7.4 or RHEL 6.9 and update to Gluster 3.12.4, the ‘df’ command shows only part of the available space on the mount point for multi-brick volumes. All nodes are at 3.12.4. This occurs on both servers and clients.

 

We have 2 different server configurations.

 

Configuration 1: A distributed volume of 8 bricks with 4 on each server. The initial configuration had 4 bricks of 59TB each with 2 on each server. Prior to the update to CentOS 7.4 and gluster 3.12.4, ‘df’ correctly showed the size for the volume as 233TB. After the update, we added 2 bricks with 1 on each server, but the output of ‘df’ still only listed 233TB for the volume. We added 2 more bricks, again with 1 on each server. The output of ‘df’ now shows 350TB, but the aggregate of 8 × 59TB bricks should be ~466TB.

 

Configuration 2: A distributed, replicated volume with 9 bricks on each server for a total of ~350TB of storage. After the server update to RHEL 6.9 and gluster 3.12.4, the volume now shows as having 50TB with ‘df’. No changes were made to this volume after the update.

 

In both cases, examining the bricks shows that the space and files are still there, just not reported correctly with ‘df’. All machines have been rebooted and the problem persists.

 

Any help/advice you can give on this would be greatly appreciated.

 

Thanks in advance.

Eva Freer

 


--
Amar Tumballi (amarts)
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
