Re: df shows wrong mount size, after adding bricks to volume

Correct, every brick is a separate xfs-formatted disk attached to the
machine. There are two disks per machine, the ones mounted in `/data2`
are the newer ones.

Thanks for the reassurance; that means we can take as long as
necessary to diagnose this. Let me know if there's more data I can
provide. The lsblk and stat -f outputs follow:

$ ssh imagegluster1 "lsblk"
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0   9.3G  0 disk
└─sda1   8:1    0   9.3G  0 part /
sdb      8:16   0 894.3G  0 disk
└─sdb1   8:17   0 894.3G  0 part /data2
sdc      8:32   0 894.3G  0 disk
└─sdc1   8:33   0 894.3G  0 part /data

$ ssh imagegluster2 "lsblk"
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0   9.3G  0 disk
└─sda1   8:1    0   9.3G  0 part /
sdb      8:16   0 894.3G  0 disk
└─sdb1   8:17   0 894.3G  0 part /data2
sdc      8:32   0 894.3G  0 disk
└─sdc1   8:33   0 894.3G  0 part /data

$ ssh imagegluster3 "lsblk"
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0   9.3G  0 disk
└─sda1   8:1    0   9.3G  0 part /
sdb      8:16   0 894.3G  0 disk
└─sdb1   8:17   0 894.3G  0 part /data2
sdc      8:32   0 894.3G  0 disk
└─sdc1   8:33   0 894.3G  0 part /data

$ ssh imagegluster1 "stat -f /data; stat -f /data2"
  File: "/data"
    ID: 82100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 234307548  Free: 111493566  Available: 111493566
Inodes: Total: 468843968  Free: 459695286
  File: "/data2"
    ID: 81100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 234307553  Free: 111110486  Available: 111110486
Inodes: Total: 468844032  Free: 459769261

$ ssh imagegluster2 "stat -f /data; stat -f /data2"
  File: "/data"
    ID: 82100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 234307548  Free: 111489680  Available: 111489680
Inodes: Total: 468843968  Free: 459695437
  File: "/data2"
    ID: 81100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 234307553  Free: 111110492  Available: 111110492
Inodes: Total: 468844032  Free: 459769261

$ ssh imagegluster3 "stat -f /data; stat -f /data2"
  File: "/data"
    ID: 82100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 234307548  Free: 111495441  Available: 111495441
Inodes: Total: 468843968  Free: 459695437
  File: "/data2"
    ID: 81100000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 234307553  Free: 111110505  Available: 111110505
Inodes: Total: 468844032  Free: 459769261
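For completeness, here is a quick cross-check that the two brick mounts on each node really are distinct filesystems (the lsblk output above already suggests they are). This is just a sketch; `same_fs` is a helper name I made up, not a gluster tool:

```shell
#!/bin/sh
# Compare device IDs (st_dev) of two paths: the same ID means the
# paths share one filesystem, different IDs mean separate filesystems.
# If the two bricks on a node shared a filesystem, a
# shared-brick-count of 2 would be expected; distinct IDs mean the
# count stored in the volfiles is stale.
same_fs() {
    if [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]; then
        echo SHARED
    else
        echo DISTINCT
    fi
}

# On the gluster nodes one would run:
#   same_fs /data/brick /data2/brick
# Demonstration on paths that exist on any Linux box (/proc is a
# separate procfs mount, so this prints DISTINCT):
same_fs / /proc
```

As I understand it, the brick's statfs result is divided by its shared-brick-count before the subvolume sizes are summed, so a stale count of 2 would explain exactly the halved ~894G total df reports instead of the expected ~1.7T.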

On Fri, May 29, 2020 at 1:10 PM Sanju Rakonde <srakonde@xxxxxxxxxx> wrote:
>
> Hi Petr,
>
> It's absolutely safe to use this volume. You will not see any problems even if the actual used size is greater than the reported total size of the volume, and it is safe to upgrade as well.
>
> Can you please share the output of the following:
> 1. lsblk output from all the 3 nodes in the cluster
> 2. stat -f <brick-mount> for all the bricks
>
> I hope all the bricks have separate filesystems, and no filesystem is shared between any two bricks. Am I correct?
>
> On Fri, May 29, 2020 at 4:25 PM Petr Certik <petr@xxxxxxxxx> wrote:
>>
>> Thanks!
>>
>> One more question -- I don't really mind having the wrong size
>> reported by df, but I'm worried whether it is safe to use the volume.
>> Will it be okay if I write to it? For example, once the actual used
>> size is greater than the reported total size of the volume, should I
>> expect problems? And is it safe to upgrade glusterfs when the volume
>> is in this state?
>>
>> Cheers,
>> Petr
>>
>> On Fri, May 29, 2020 at 11:37 AM Sanju Rakonde <srakonde@xxxxxxxxxx> wrote:
>> >
>> > Nope, for now. I will update you if we figure out any other workaround.
>> >
>> > Thanks for your help!
>> >
>> > On Fri, May 29, 2020 at 2:50 PM Petr Certik <petr@xxxxxxxxx> wrote:
>> >>
>> >> I'm afraid I don't have the resources to try and reproduce from the
>> >> beginning. Is there anything else I can do to get you more
>> >> information?
>> >>
>> >>
>> >> On Fri, May 29, 2020 at 11:08 AM Sanju Rakonde <srakonde@xxxxxxxxxx> wrote:
>> >> >
>> >> > The issue is not with glusterd restart. We need to reproduce it from the beginning and add bricks to check the df -h values.
>> >> >
>> >> > I suggest not trying this on the production environment. If you have any other machines, please let me know.
>> >> >
>> >> > On Fri, May 29, 2020 at 1:37 PM Petr Certik <petr@xxxxxxxxx> wrote:
>> >> >>
>> >> >> If you mean the issue during node restart, then yes, I think I could
>> >> >> reproduce that with a custom build. It's a production system, though,
>> >> >> so I'll need to be extremely careful.
>> >> >>
>> >> >> We're using debian glusterfs-server 7.3-1 amd64, can you provide a
>> >> >> custom glusterd binary based off of that version?
>> >> >>
>> >> >> Cheers,
>> >> >> Petr
>> >> >>
>> >> >> On Fri, May 29, 2020 at 9:09 AM Sanju Rakonde <srakonde@xxxxxxxxxx> wrote:
>> >> >> >
>> >> >> > Surprising! Will you be able to reproduce the issue and share the logs if I provide a custom build with more logs?
>> >> >> >
>> >> >> > On Thu, May 28, 2020 at 1:35 PM Petr Certik <petr@xxxxxxxxx> wrote:
>> >> >> >>
>> >> >> >> Thanks for your help! Much appreciated.
>> >> >> >>
>> >> >> >> The fsid is the same for all bricks:
>> >> >> >>
>> >> >> >> imagegluster1:
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster1:-data2-brick:brick-fsid=2065
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster1:-data-brick:brick-fsid=2065
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster2:-data2-brick:brick-fsid=0
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster2:-data-brick:brick-fsid=0
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster3:-data2-brick:brick-fsid=0
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster3:-data-brick:brick-fsid=0
>> >> >> >>
>> >> >> >> imagegluster2:
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster1:-data2-brick:brick-fsid=0
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster1:-data-brick:brick-fsid=0
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster2:-data2-brick:brick-fsid=2065
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster2:-data-brick:brick-fsid=2065
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster3:-data2-brick:brick-fsid=0
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster3:-data-brick:brick-fsid=0
>> >> >> >>
>> >> >> >> imagegluster3:
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster1:-data2-brick:brick-fsid=0
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster1:-data-brick:brick-fsid=0
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster2:-data2-brick:brick-fsid=0
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster2:-data-brick:brick-fsid=0
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster3:-data2-brick:brick-fsid=2065
>> >> >> >> /var/lib/glusterd/vols/gv0/bricks/imagegluster3:-data-brick:brick-fsid=2065
>> >> >> >>
>> >> >> >>
>> >> >> >> I already tried restarting the glusterd nodes, with no effect, but
>> >> >> >> that was before the client version upgrades.
>> >> >> >>
>> >> >> >> Running the "volume set" command did not seem to work either; the
>> >> >> >> shared-brick-counts are still the same (2).
>> >> >> >>
>> >> >> >> However, when restarting a node, I do get an error and a few warnings
>> >> >> >> in the log: https://pastebin.com/tqq1FCwZ
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> On Wed, May 27, 2020 at 3:14 PM Sanju Rakonde <srakonde@xxxxxxxxxx> wrote:
>> >> >> >> >
>> >> >> >> > The shared-brick-count value indicates the number of bricks sharing a file-system. In your case, it should be one, as all the bricks are from different mount points. Can you please share the values of brick-fsid?
>> >> >> >> >
>> >> >> >> > grep "brick-fsid" /var/lib/glusterd/vols/<volname>/bricks/*
>> >> >> >> >
>> >> >> >> > I tried reproducing this issue in Fedora VMs but couldn't hit it. We see this issue on and off but are unable to reproduce it in-house. If you see any error messages in glusterd.log, please share the log too.
>> >> >> >> >
>> >> >> >> > Workaround to get out of this situation:
>> >> >> >> > 1. Restart the glusterd service on all nodes:
>> >> >> >> > # systemctl restart glusterd
>> >> >> >> >
>> >> >> >> > 2. Run a volume set command to update the volfiles:
>> >> >> >> > # gluster v set <VOLNAME> min-free-disk 11%
>> >> >> >> >
>> >> >> >> > On Wed, May 27, 2020 at 5:24 PM Petr Certik <petr@xxxxxxxxx> wrote:
>> >> >> >> >>
>> >> >> >> >> As far as I remember, there was no version update on the server. It
>> >> >> >> >> was definitely installed as version 7.
>> >> >> >> >>
>> >> >> >> >> Shared bricks:
>> >> >> >> >>
>> >> >> >> >> Server 1:
>> >> >> >> >>
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster1.data2-brick.vol: option shared-brick-count 2
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster1.data-brick.vol:  option shared-brick-count 2
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster2.data2-brick.vol: option shared-brick-count 0
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster2.data-brick.vol:  option shared-brick-count 0
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster3.data2-brick.vol: option shared-brick-count 0
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster3.data-brick.vol:  option shared-brick-count 0
>> >> >> >> >>
>> >> >> >> >> Server 2:
>> >> >> >> >>
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster1.data2-brick.vol: option shared-brick-count 0
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster1.data-brick.vol:  option shared-brick-count 0
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster2.data2-brick.vol: option shared-brick-count 2
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster2.data-brick.vol:  option shared-brick-count 2
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster3.data2-brick.vol: option shared-brick-count 0
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster3.data-brick.vol:  option shared-brick-count 0
>> >> >> >> >>
>> >> >> >> >> Server 3:
>> >> >> >> >>
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster1.data2-brick.vol: option shared-brick-count 0
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster1.data-brick.vol:  option shared-brick-count 0
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster2.data2-brick.vol: option shared-brick-count 0
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster2.data-brick.vol:  option shared-brick-count 0
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster3.data2-brick.vol: option shared-brick-count 2
>> >> >> >> >> /var/lib/glusterd/vols/gv0/gv0.imagegluster3.data-brick.vol:  option shared-brick-count 2
>> >> >> >> >>
>> >> >> >> >> On Wed, May 27, 2020 at 1:36 PM Sanju Rakonde <srakonde@xxxxxxxxxx> wrote:
>> >> >> >> >> >
>> >> >> >> >> > Hi Petr,
>> >> >> >> >> >
>> >> >> >> >> > what was the server version before upgrading to 7.2?
>> >> >> >> >> >
>> >> >> >> >> > Can you please share the shared-brick-count values from brick volfiles from all the nodes?
>> >> >> >> >> > grep shared-brick-count /var/lib/glusterd/vols/<volume_name>/*
>> >> >> >> >> >
>> >> >> >> >> > On Wed, May 27, 2020 at 2:31 PM Petr Certik <petr@xxxxxxxxx> wrote:
>> >> >> >> >> >>
>> >> >> >> >> >> Hi everyone,
>> >> >> >> >> >>
>> >> >> >> >> >> we've been running a replicated volume for a while, with three ~1 TB
>> >> >> >> >> >> bricks. Recently we've added three more same-sized bricks, making it a
>> >> >> >> >> >> 2 x 3 distributed replicated volume. However, even after rebalance,
>> >> >> >> >> >> the `df` command on a client shows the correct used/size percentage,
>> >> >> >> >> >> but wrong absolute sizes. The size still shows up as ~1 TB while in
>> >> >> >> >> >> reality it should be around 2 TB, and both "used" and "available"
>> >> >> >> >> >> reported sizes are about half of what they should be. The clients were
>> >> >> >> >> >> an old version (5.5), but even after upgrade to 7.2 and remount, the
>> >> >> >> >> >> reported sizes are still wrong. There are no heal entries. What can I
>> >> >> >> >> >> do to fix this?
>> >> >> >> >> >>
>> >> >> >> >> >> OS: debian buster everywhere
>> >> >> >> >> >> Server version: 7.3-1, opversion: 70200
>> >> >> >> >> >> Client versions: 5.5-3, 7.6-1, opversions: 50400, 70200
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> root@imagegluster1:~# gluster volume info gv0
>> >> >> >> >> >> Volume Name: gv0
>> >> >> >> >> >> Type: Distributed-Replicate
>> >> >> >> >> >> Volume ID: 5505d350-9b61-4056-9054-de9dfb58eab7
>> >> >> >> >> >> Status: Started
>> >> >> >> >> >> Snapshot Count: 0
>> >> >> >> >> >> Number of Bricks: 2 x 3 = 6
>> >> >> >> >> >> Transport-type: tcp
>> >> >> >> >> >> Bricks:
>> >> >> >> >> >> Brick1: imagegluster1:/data/brick
>> >> >> >> >> >> Brick2: imagegluster2:/data/brick
>> >> >> >> >> >> Brick3: imagegluster3:/data/brick
>> >> >> >> >> >> Brick4: imagegluster1:/data2/brick
>> >> >> >> >> >> Brick5: imagegluster2:/data2/brick
>> >> >> >> >> >> Brick6: imagegluster3:/data2/brick
>> >> >> >> >> >> Options Reconfigured:
>> >> >> >> >> >> features.cache-invalidation: on
>> >> >> >> >> >> transport.address-family: inet
>> >> >> >> >> >> storage.fips-mode-rchecksum: on
>> >> >> >> >> >> nfs.disable: on
>> >> >> >> >> >> performance.client-io-threads: off
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> root@imagegluster1:~# df -h
>> >> >> >> >> >> Filesystem      Size  Used Avail Use% Mounted on
>> >> >> >> >> >> ...
>> >> >> >> >> >> /dev/sdb1       894G  470G  425G  53% /data2
>> >> >> >> >> >> /dev/sdc1       894G  469G  426G  53% /data
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> root@any-of-the-clients:~# df -h
>> >> >> >> >> >> Filesystem         Size  Used Avail Use% Mounted on
>> >> >> >> >> >> ...
>> >> >> >> >> >> imagegluster:/gv0  894G  478G  416G  54% /mnt/gluster
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> Let me know if there's any other info I can provide about our setup.
>> >> >> >> >> >>
>> >> >> >> >> >> Cheers,
>> >> >> >> >> >> Petr Certik
>> >> >> >> >> >> ________
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> Community Meeting Calendar:
>> >> >> >> >> >>
>> >> >> >> >> >> Schedule -
>> >> >> >> >> >> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> >> >> >> >> >> Bridge: https://bluejeans.com/441850968
>> >> >> >> >> >>
>> >> >> >> >> >> Gluster-users mailing list
>> >> >> >> >> >> Gluster-users@xxxxxxxxxxx
>> >> >> >> >> >> https://lists.gluster.org/mailman/listinfo/gluster-users
>> >> >> >> >> >>
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > --
>> >> >> >> >> > Thanks,
>> >> >> >> >> > Sanju
>> >> >> >> >>
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > --
>> >> >> >> > Thanks,
>> >> >> >> > Sanju
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > Thanks,
>> >> >> > Sanju
>> >> >>
>> >> >
>> >> >
>> >> > --
>> >> > Thanks,
>> >> > Sanju
>> >>
>> >
>> >
>> > --
>> > Thanks,
>> > Sanju
>>
>
>
> --
> Thanks,
> Sanju