Re: Quota Used Value Incorrect - Fix now or after upgrade


 




----- Original Message -----
> From: "Matthew B" <matthew.has.questions@xxxxxxxxx>
> To: "Sanoj Unnikrishnan" <sunnikri@xxxxxxxxxx>
> Cc: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> Sent: Monday, August 28, 2017 9:33:25 PM
> Subject: Re:  Quota Used Value Incorrect - Fix now or after upgrade
> 
> Hi Sanoj,
> 
> Thank you for the information - I have applied the changes you specified
> above - but I haven't seen any changes in the xattrs on the directory after
> about 15 minutes:

I think stat is being served from a cache: either Gluster's md-cache or the kernel attribute cache. For healing to happen we need to force a lookup (which we had hoped the stat command would issue), and this lookup has to reach the marker xlator loaded on the bricks. To make sure a lookup on the directory reaches marker, we need to:

1. Turn off the kernel attribute and entry caches (pass --entry-timeout=0 and --attribute-timeout=0 as options to glusterfs while mounting)
2. Turn off md-cache using the gluster CLI (gluster volume set <volname> performance.stat-prefetch off)
3. Turn off readdirplus in the entire stack [1]

Once the above steps are done, a stat should result in a lookup on the directory that marker sees. Once the issue is fixed, you can undo the above three steps so that performance is not affected in your setup.
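
Steps 1 and 2 and the follow-up stat could be sketched roughly as below. This is only an illustration under assumptions, not a tested procedure: the helper names (run, force_lookup) and the mount point /mnt/quota-heal are made up, the script only prints the commands unless DRY_RUN=0 is set, and it assumes md-cache is toggled through the performance.stat-prefetch volume option. Step 3 is left out; see [1] for the readdirplus options.

```shell
#!/bin/sh
# Sketch of steps 1 and 2 plus the stat. Prints the commands by default;
# set DRY_RUN=0 to actually execute them. All argument values are placeholders.

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# force_lookup VOLNAME SERVER MOUNTPOINT DIR   (hypothetical helper)
force_lookup() {
    vol=$1 server=$2 mnt=$3 dir=$4
    # Step 1: fresh FUSE mount with kernel entry/attribute caching disabled.
    run glusterfs --volfile-server="$server" --volfile-id="$vol" \
        --entry-timeout=0 --attribute-timeout=0 "$mnt"
    # Step 2: disable md-cache (assumed option name: performance.stat-prefetch).
    run gluster volume set "$vol" performance.stat-prefetch off
    # Step 3 (readdirplus) is intentionally not shown; see [1].
    # Stat through the mount so the lookup travels down to the bricks.
    run stat "$mnt/$dir"
}

force_lookup storage 10.0.231.50 /mnt/quota-heal data/projects/MEOPAR
```

The stat goes through the temporary mount rather than the brick path, since the point is for the lookup to pass down the client stack to marker on the bricks.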

[1] http://nongnu.13855.n7.nabble.com/Turning-off-readdirp-in-the-entire-stack-on-fuse-mount-td220297.html

> 
> [root@gluster07 ~]# setfattr -n trusted.glusterfs.quota.dirty -v 0x3100
> /mnt/raid6-storage/storage/data/projects/MEOPAR/
> 
> [root@gluster07 ~]# stat /mnt/raid6-storage/storage/data/projects/MEOPAR
> 
> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex
> /mnt/raid6-storage/storage/data/projects/MEOPAR
> # file: /mnt/raid6-storage/storage/data/projects/MEOPAR
> security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
> trusted.gfid=0x7209b677f4b94d82a3820733620e6929
> trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x599f228800088654
> trusted.glusterfs.dht=0x0000000100000000b6db6d41db6db6ee
> trusted.glusterfs.quota.d5a5ecda-7511-4bbb-9b4c-4fcc84e3e1da.contri=0xfffffa3d7c1ba60000000000000a9ccb000000000005fd2f
> trusted.glusterfs.quota.dirty=0x3100
> trusted.glusterfs.quota.limit-set=0x0000088000000000ffffffffffffffff
> trusted.glusterfs.quota.size=0xfffffa3d7c1ba60000000000000a9ccb000000000005fd2f
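
To tell whether the value changes after a heal, it may help to decode the size xattr instead of comparing hex strings by eye. The sketch below assumes the 24-byte value holds three big-endian signed 64-bit integers (bytes used, file count, directory count); that layout is an assumption on my part, and decode_quota_size and hex64_to_signed are hypothetical helper names. If the layout is right, a negative first field read as unsigned lands just below 2^64 bytes, which would be consistent with the 16384.0PB shown by the quota list command.

```shell
#!/bin/sh
# Sketch: decode a trusted.glusterfs.quota.size value as printed by
# `getfattr -e hex`. ASSUMPTION: 24 bytes = three big-endian signed
# 64-bit integers (bytes used, file count, directory count).

# Convert 16 hex digits to a signed 64-bit decimal. The value is split into
# two 32-bit halves so the result does not depend on how a particular shell
# handles overflowing constants.
hex64_to_signed() (
    hi=$(( 0x$(printf '%s' "$1" | cut -c1-8) ))
    lo=$(( 0x$(printf '%s' "$1" | cut -c9-16) ))
    # Top bit set means the value is negative in two's complement.
    if [ "$hi" -ge 2147483648 ]; then
        hi=$(( hi - 4294967296 ))
    fi
    echo $(( hi * 4294967296 + lo ))
)

# decode_quota_size 0x<48 hex digits>   (hypothetical helper)
decode_quota_size() (
    hex=${1#0x}
    if [ "${#hex}" -ne 48 ]; then
        echo "expected 48 hex digits, got ${#hex}" >&2
        exit 1
    fi
    size=$(hex64_to_signed "$(printf '%s' "$hex" | cut -c1-16)")
    files=$(hex64_to_signed "$(printf '%s' "$hex" | cut -c17-32)")
    dirs=$(hex64_to_signed "$(printf '%s' "$hex" | cut -c33-48)")
    echo "size_bytes=$size file_count=$files dir_count=$dirs"
)

# Value taken from the getfattr output above.
decode_quota_size 0xfffffa3d7c1ba60000000000000a9ccb000000000005fd2f
```

Running the same helper before and after the stat gives a quick way to see whether marker has rewritten the accounting.
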
> 
> [root@gluster07 ~]# gluster volume status storage
> Status of volume: storage
> Gluster process                               TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick 10.0.231.50:/mnt/raid6-storage/storage  49159     0          Y       2160
> Brick 10.0.231.51:/mnt/raid6-storage/storage  49153     0          Y       16037
> Brick 10.0.231.52:/mnt/raid6-storage/storage  49159     0          Y       2298
> Brick 10.0.231.53:/mnt/raid6-storage/storage  49154     0          Y       9038
> Brick 10.0.231.54:/mnt/raid6-storage/storage  49153     0          Y       32284
> Brick 10.0.231.55:/mnt/raid6-storage/storage  49153     0          Y       14840
> Brick 10.0.231.56:/mnt/raid6-storage/storage  49152     0          Y       29389
> NFS Server on localhost                       2049      0          Y       29421
> Quota Daemon on localhost                     N/A       N/A        Y       29438
> NFS Server on 10.0.231.51                     2049      0          Y       18249
> Quota Daemon on 10.0.231.51                   N/A       N/A        Y       18260
> NFS Server on 10.0.231.55                     2049      0          Y       24128
> Quota Daemon on 10.0.231.55                   N/A       N/A        Y       24147
> NFS Server on 10.0.231.54                     2049      0          Y       9397
> Quota Daemon on 10.0.231.54                   N/A       N/A        Y       9406
> NFS Server on 10.0.231.53                     2049      0          Y       18387
> Quota Daemon on 10.0.231.53                   N/A       N/A        Y       18397
> NFS Server on 10.0.231.52                     2049      0          Y       2230
> Quota Daemon on 10.0.231.52                   N/A       N/A        Y       2262
> NFS Server on 10.0.231.50                     2049      0          Y       2113
> Quota Daemon on 10.0.231.50                   N/A       N/A        Y       2154
> 
> Task Status of Volume storage
> ------------------------------------------------------------------------------
> There are no active volume tasks
> 
> [root@gluster07 ~]# gluster volume quota storage list | egrep "MEOPAR "
> /data/projects/MEOPAR                      8.5TB     80%(6.8TB) 16384.0PB 17.4TB              No                   No
> 
> 
> 
> 
> Looking at the quota daemon on gluster07:
> 
> [root@gluster07 ~]# ps -f -p 29438
> UID        PID  PPID  C STIME TTY          TIME CMD
> root     29438     1  0 Jun19 ?        04:43:31 /usr/sbin/glusterfs -s
> localhost --volfile-id gluster/quotad -p
> /var/lib/glusterd/quotad/run/quotad.pid -l /var/log/glusterfs/quotad.log
> 
> I can see some errors on the log - not sure if those are related:
> 
> [root@gluster07 ~]# tail /var/log/glusterfs/quotad.log
> [2017-08-28 15:36:17.990909] W [dict.c:592:dict_unref]
> (-->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(qd_lookup_cbk+0x35e)
> [0x7f79fb09253e]
> -->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(quotad_aggregator_getlimit_cbk+0xb3)
> [0x7f79fb093333] -->/lib64/libglusterfs.so.0(dict_unref+0x99)
> [0x7f7a090299e9] ) 0-dict: dict is NULL [Invalid argument]
> [2017-08-28 15:36:17.991389] W [dict.c:592:dict_unref]
> (-->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(qd_lookup_cbk+0x35e)
> [0x7f79fb09253e]
> -->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(quotad_aggregator_getlimit_cbk+0xb3)
> [0x7f79fb093333] -->/lib64/libglusterfs.so.0(dict_unref+0x99)
> [0x7f7a090299e9] ) 0-dict: dict is NULL [Invalid argument]
> [2017-08-28 15:36:17.992656] W [dict.c:592:dict_unref]
> (-->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(qd_lookup_cbk+0x35e)
> [0x7f79fb09253e]
> -->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(quotad_aggregator_getlimit_cbk+0xb3)
> [0x7f79fb093333] -->/lib64/libglusterfs.so.0(dict_unref+0x99)
> [0x7f7a090299e9] ) 0-dict: dict is NULL [Invalid argument]
> [2017-08-28 15:36:17.993235] W [dict.c:592:dict_unref]
> (-->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(qd_lookup_cbk+0x35e)
> [0x7f79fb09253e]
> -->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(quotad_aggregator_getlimit_cbk+0xb3)
> [0x7f79fb093333] -->/lib64/libglusterfs.so.0(dict_unref+0x99)
> [0x7f7a090299e9] ) 0-dict: dict is NULL [Invalid argument]
> [2017-08-28 15:45:51.024756] W [dict.c:592:dict_unref]
> (-->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(qd_lookup_cbk+0x35e)
> [0x7f79fb09253e]
> -->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(quotad_aggregator_getlimit_cbk+0xb3)
> [0x7f79fb093333] -->/lib64/libglusterfs.so.0(dict_unref+0x99)
> [0x7f7a090299e9] ) 0-dict: dict is NULL [Invalid argument]
> [2017-08-28 15:45:51.027871] W [dict.c:592:dict_unref]
> (-->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(qd_lookup_cbk+0x35e)
> [0x7f79fb09253e]
> -->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(quotad_aggregator_getlimit_cbk+0xb3)
> [0x7f79fb093333] -->/lib64/libglusterfs.so.0(dict_unref+0x99)
> [0x7f7a090299e9] ) 0-dict: dict is NULL [Invalid argument]
> [2017-08-28 15:45:51.030843] W [dict.c:592:dict_unref]
> (-->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(qd_lookup_cbk+0x35e)
> [0x7f79fb09253e]
> -->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(quotad_aggregator_getlimit_cbk+0xb3)
> [0x7f79fb093333] -->/lib64/libglusterfs.so.0(dict_unref+0x99)
> [0x7f7a090299e9] ) 0-dict: dict is NULL [Invalid argument]
> [2017-08-28 15:45:51.031324] W [dict.c:592:dict_unref]
> (-->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(qd_lookup_cbk+0x35e)
> [0x7f79fb09253e]
> -->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(quotad_aggregator_getlimit_cbk+0xb3)
> [0x7f79fb093333] -->/lib64/libglusterfs.so.0(dict_unref+0x99)
> [0x7f7a090299e9] ) 0-dict: dict is NULL [Invalid argument]
> [2017-08-28 15:45:51.032791] W [dict.c:592:dict_unref]
> (-->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(qd_lookup_cbk+0x35e)
> [0x7f79fb09253e]
> -->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(quotad_aggregator_getlimit_cbk+0xb3)
> [0x7f79fb093333] -->/lib64/libglusterfs.so.0(dict_unref+0x99)
> [0x7f7a090299e9] ) 0-dict: dict is NULL [Invalid argument]
> [2017-08-28 15:45:51.033295] W [dict.c:592:dict_unref]
> (-->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(qd_lookup_cbk+0x35e)
> [0x7f79fb09253e]
> -->/usr/lib64/glusterfs/3.7.13/xlator/features/quotad.so(quotad_aggregator_getlimit_cbk+0xb3)
> [0x7f79fb093333] -->/lib64/libglusterfs.so.0(dict_unref+0x99)
> [0x7f7a090299e9] ) 0-dict: dict is NULL [Invalid argument]
> 
> How should I proceed?
> 
> Thanks,
> -Matthew
> 
> On Mon, Aug 28, 2017 at 3:13 AM, Sanoj Unnikrishnan <sunnikri@xxxxxxxxxx>
> wrote:
> 
> > Hi Matthew,
> >
> > If you are sure that "/mnt/raid6-storage/storage/data/projects/MEOPAR/"
> > is the only directory with wrong accounting and that its immediate
> > subdirectories have correct xattr values, setting the dirty xattr and then
> > doing a stat should resolve the issue.
> >
> > 1) setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 /mnt/raid6-storage/storage/data/projects/MEOPAR/
> >
> > 2) stat /mnt/raid6-storage/storage/data/projects/MEOPAR/
> >
> > Could you please share what kinds of operations happen on this
> > directory, to help us find the root cause (RCA) of the issue?
> >
> > If you think the same problem may exist elsewhere in the filesystem, use the
> > following scripts to identify it:
> >
> > 1) https://github.com/gluster/glusterfs/blob/master/extras/quota/xattr_analysis.py
> > 2) https://github.com/gluster/glusterfs/blob/master/extras/quota/log_accounting.sh
> >
> > Regards,
> > Sanoj
> >
> >
> >
> >
> > On Mon, Aug 28, 2017 at 12:39 PM, Raghavendra Gowdappa <
> > rgowdapp@xxxxxxxxxx> wrote:
> >
> >> +sanoj
> >>
> >> ----- Original Message -----
> >> > From: "Matthew B" <matthew.has.questions@xxxxxxxxx>
> >> > To: gluster-devel@xxxxxxxxxxx
> >> > Sent: Saturday, August 26, 2017 12:45:19 AM
> >> > Subject:  Quota Used Value Incorrect - Fix now or after upgrade
> >> >
> >> > Hello,
> >> >
> >> > I need some advice on fixing an issue with quota on my gluster volume. It's
> >> > running version 3.7, distributed volume, with 7 nodes.
> >> >
> >> > # gluster --version
> >> > glusterfs 3.7.13 built on Jul 8 2016 15:26:18
> >> > Repository revision: git://git.gluster.com/glusterfs.git
> >> > Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
> >> > GlusterFS comes with ABSOLUTELY NO WARRANTY.
> >> > You may redistribute copies of GlusterFS under the terms of the GNU General
> >> > Public License.
> >> >
> >> > # gluster volume info storage
> >> >
> >> > Volume Name: storage
> >> > Type: Distribute
> >> > Volume ID: 6f95525a-94d7-4174-bac4-e1a18fe010a2
> >> > Status: Started
> >> > Number of Bricks: 7
> >> > Transport-type: tcp
> >> > Bricks:
> >> > Brick1: 10.0.231.50:/mnt/raid6-storage/storage
> >> > Brick2: 10.0.231.51:/mnt/raid6-storage/storage
> >> > Brick3: 10.0.231.52:/mnt/raid6-storage/storage
> >> > Brick4: 10.0.231.53:/mnt/raid6-storage/storage
> >> > Brick5: 10.0.231.54:/mnt/raid6-storage/storage
> >> > Brick6: 10.0.231.55:/mnt/raid6-storage/storage
> >> > Brick7: 10.0.231.56:/mnt/raid6-storage/storage
> >> > Options Reconfigured:
> >> > changelog.changelog: on
> >> > geo-replication.ignore-pid-check: on
> >> > geo-replication.indexing: on
> >> > nfs.disable: no
> >> > performance.readdir-ahead: on
> >> > features.quota: on
> >> > features.inode-quota: on
> >> > features.quota-deem-statfs: on
> >> > features.read-only: off
> >> >
> >> > # df -h /storage/
> >> > Filesystem Size Used Avail Use% Mounted on
> >> > 10.0.231.50:/storage 255T 172T 83T 68% /storage
> >> >
> >> >
> >> > I am planning to upgrade to 3.10 (or 3.12 when it's available) but I have a
> >> > number of quotas configured, and one of them (below) has a very wrong "Used"
> >> > value:
> >> >
> >> > # gluster volume quota storage list | egrep "MEOPAR "
> >> > /data/projects/MEOPAR 8.5TB 80%(6.8TB) 16384.0PB 17.4TB No No
> >> >
> >> >
> >> > I have confirmed the bad value appears in one of the brick's current xattrs,
> >> > and it looks like the issue has been encountered previously on bricks 04,
> >> > 03, and 06 (gluster07 does not have a trusted.glusterfs.quota.size.1 as it
> >> > was recently added):
> >> >
> >> > $ ansible -i hosts gluster-servers[0:6] -u <user> --ask-pass -m shell -b --become-method=sudo --ask-become-pass -a "getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/data/projects/MEOPAR | egrep '^trusted.glusterfs.quota.size'"
> >> > SSH password:
> >> > SUDO password[defaults to SSH password]:
> >> >
> >> > gluster02 | SUCCESS | rc=0 >>
> >> > trusted.glusterfs.quota.size=0x0000011ecfa56c00000000000005cd6d000000000006d478
> >> > trusted.glusterfs.quota.size.1=0x0000010ad4a452000000000000012a0300000000000150fa
> >> >
> >> > gluster05 | SUCCESS | rc=0 >>
> >> > trusted.glusterfs.quota.size=0x00000033b8e92200000000000004cde8000000000006b1a4
> >> > trusted.glusterfs.quota.size.1=0x0000010dca277c00000000000001297d0000000000015005
> >> >
> >> > gluster01 | SUCCESS | rc=0 >>
> >> > trusted.glusterfs.quota.size=0x0000003d4d4348000000000000057616000000000006afd2
> >> > trusted.glusterfs.quota.size.1=0x00000133fe211e00000000000005d161000000000006cfd4
> >> >
> >> > gluster04 | SUCCESS | rc=0 >>
> >> > trusted.glusterfs.quota.size=0xffffff396f3e9400000000000004d7ea0000000000068c62
> >> > trusted.glusterfs.quota.size.1=0x00000106e6724800000000000001138f0000000000012fb2
> >> >
> >> > gluster03 | SUCCESS | rc=0 >>
> >> > trusted.glusterfs.quota.size=0xfffffd02acabf000000000000003599000000000000643e2
> >> > trusted.glusterfs.quota.size.1=0x00000114e20f5e0000000000000113b30000000000012fb2
> >> >
> >> > gluster06 | SUCCESS | rc=0 >>
> >> > trusted.glusterfs.quota.size=0xffffff0c98de440000000000000536e40000000000068cf2
> >> > trusted.glusterfs.quota.size.1=0x0000013532664e00000000000005e73f000000000006cfd4
> >> >
> >> > gluster07 | SUCCESS | rc=0 >>
> >> > trusted.glusterfs.quota.size=0xfffffa3d7c1ba60000000000000a9ccb000000000005fd2f
> >> >
> >> > And reviewing the subdirectories of that folder on the impacted server, you
> >> > can see that none of the direct children have such incorrect values:
> >> >
> >> > [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/data/projects/MEOPAR/*
> >> > # file: /mnt/raid6-storage/storage/data/projects/MEOPAR/<dir1>
> >> > ...
> >> > trusted.glusterfs.quota.7209b677-f4b9-4d82-a382-0733620e6929.contri=0x000000fb6841820000000000000074730000000000000dae
> >> > trusted.glusterfs.quota.dirty=0x3000
> >> > trusted.glusterfs.quota.size=0x000000fb6841820000000000000074730000000000000dae
> >> >
> >> > # file: /mnt/raid6-storage/storage/data/projects/MEOPAR/<dir2>
> >> > ...
> >> > trusted.glusterfs.quota.7209b677-f4b9-4d82-a382-0733620e6929.contri=0x0000000416d5f4000000000000000baa0000000000000441
> >> > trusted.glusterfs.quota.dirty=0x3000
> >> > trusted.glusterfs.quota.limit-set=0x0000010000000000ffffffffffffffff
> >> > trusted.glusterfs.quota.size=0x0000000416d5f4000000000000000baa0000000000000441
> >> >
> >> > # file: /mnt/raid6-storage/storage/data/projects/MEOPAR/<dir3>
> >> > ...
> >> > trusted.glusterfs.quota.7209b677-f4b9-4d82-a382-0733620e6929.contri=0x000000110f2c4e00000000000002a76a000000000006ad7d
> >> > trusted.glusterfs.quota.dirty=0x3000
> >> > trusted.glusterfs.quota.limit-set=0x0000020000000000ffffffffffffffff
> >> > trusted.glusterfs.quota.size=0x000000110f2c4e00000000000002a76a000000000006ad7d
> >> >
> >> >
> >> > Can I fix this on the current version of gluster (3.7) on just the one brick
> >> > before I upgrade? Or would I be better off upgrading to 3.10 and trying to
> >> > fix it then?
> >> >
> >> > I have reviewed information here:
> >> >
> >> > http://lists.gluster.org/pipermail/gluster-devel/2016-February/048282.html
> >> > http://lists.gluster.org/pipermail/gluster-users.old/2016-September/028365.html
> >> >
> >> > It seems like since I am on Gluster 3.7 I can disable quotas and re-enable
> >> > them, and everything will get recalculated and the index on the quota.size
> >> > xattr will be incremented. But with the size of the volume that will take a
> >> > very long time.
> >> >
> >> > Could I simply mark the impacted directory as dirty on gluster07? Or update
> >> > the xattr directly as the sum of the sizes of dir1, 2, and 3?
> >> >
> >> > Thanks,
> >> > -Matthew
> >> >
> >> > _______________________________________________
> >> > Gluster-devel mailing list
> >> > Gluster-devel@xxxxxxxxxxx
> >> > http://lists.gluster.org/mailman/listinfo/gluster-devel
> >>
> >
> >
> 


