Re: Wrong directory quota usage

Hi João,

I'd recommend going with the disable/enable of the quota, as that would eventually do the same thing. That would be a better option than manually changing the parameters in the said command.
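
For reference, the cycle would be something like the following (a sketch, assuming the volume is named tank, as the volfile-id in the crawl command you posted suggests; note that, as you mentioned earlier, disabling removes the quota configs, so the limits have to be set again afterwards):

# gluster volume quota tank disable
# gluster volume quota tank enable
# gluster volume quota tank limit-usage /projectB 100TB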

--
Thanks and Regards,

SRIJAN SIVAKUMAR

Associate Software Engineer

Red Hat


T: +91-9727532362    


On Wed, Aug 19, 2020 at 8:12 PM João Baúto <joao.bauto@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
Hi Srijan,

Before I do the disable/enable, I just want to check something with you. On the other cluster, where the crawl is running, I can see the find command and also this one, which seems to be the one triggering the crawler (4 processes, one per brick, on all nodes):

/usr/sbin/glusterfs -s localhost --volfile-id client_per_brick/tank.client.hostname.tank-volume1-brick.vol --use-readdirp=yes --client-pid -100 -l /var/log/glusterfs/quota_crawl/tank-volume1-brick.log /var/run/gluster/tmp/mntYbIVwT

Can I manually trigger this command?

Thanks!
João Baúto
---------------
Scientific Computing and Software Platform
Champalimaud Research
Champalimaud Center for the Unknown
Av. Brasília, Doca de Pedrouços
1400-038 Lisbon, Portugal

fchampalimaud.org


Srijan Sivakumar <ssivakum@xxxxxxxxxx> wrote on Wednesday, 19/08/2020 at 07:25:
Hi João,

If the crawl is not running and the values are still not reflecting properly, then it means the crawl process ended abruptly.

Yes, technically disabling and enabling the quota will trigger a crawl, but it would do a complete crawl of the filesystem, so it would take time and consume resources. Disabling and re-enabling is usually the last resort when the accounting isn't reflecting properly, but since you're going to merge these two clusters, you can probably go ahead with the merge and then enable quota afterwards.

--
Thanks and Regards,

SRIJAN SIVAKUMAR

Associate Software Engineer

Red Hat


T: +91-9727532362    


On Wed, Aug 19, 2020 at 3:53 AM João Baúto <joao.bauto@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
Hi Srijan,

I didn't get any result with that command, so I went to our other cluster (we are merging two clusters; the data is replicated) and activated the quota feature on the same directory. Running the same command on each node, I get output similar to yours. One process per brick, I'm assuming.

root     1746822  1.4  0.0 230324  2992 ?        S    23:06   0:04 /usr/bin/find . -exec /usr/bin/stat {} \;
root     1746858  5.3  0.0 233924  6644 ?        S    23:06   0:15 /usr/bin/find . -exec /usr/bin/stat {} \;
root     1746889  3.3  0.0 233592  6452 ?        S    23:06   0:10 /usr/bin/find . -exec /usr/bin/stat {} \;
root     1746930  3.1  0.0 230476  3232 ?        S    23:06   0:09 /usr/bin/find . -exec /usr/bin/stat {} \;

At this point, is it easier to just disable and enable the feature and force a new crawl? We don't mind a temporary increase in CPU and IO usage.

Thank you again!
João Baúto
---------------
Scientific Computing and Software Platform
Champalimaud Research
Champalimaud Center for the Unknown
Av. Brasília, Doca de Pedrouços
1400-038 Lisbon, Portugal

fchampalimaud.org


Srijan Sivakumar <ssivakum@xxxxxxxxxx> wrote on Tuesday, 18/08/2020 at 21:42:
Hi João,

There isn't a straightforward way of tracking the crawl, but since gluster uses find and stat during the crawl, one can run the following command:
# ps aux | grep find

If the output is of the form,
"root    1513  0.0  0.1  127224  2636 ?        S    12:24   0:00 /usr/bin/find . -exec /usr/bin/stat {} \;"
then it means that the crawl is still going on.
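
Note that grep will also match itself in that output; a common trick to filter it out is,
# ps aux | grep '[f]ind'
which lists only the actual crawl processes (one per brick).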


Thanks and Regards,

SRIJAN SIVAKUMAR

Associate Software Engineer

Red Hat


T: +91-9727532362    



On Wed, Aug 19, 2020 at 1:46 AM João Baúto <joao.bauto@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
Hi Srijan,

Is there a way of getting the status of the crawl process?
We are going to expand this cluster, adding 12 new bricks (around 500TB), and we rely heavily on the quota feature to control the space usage of each project. The crawl has been running since Saturday (nothing has changed) and I'm unsure whether it will finish tomorrow or in weeks.

Thank you!
João Baúto
---------------
Scientific Computing and Software Platform
Champalimaud Research
Champalimaud Center for the Unknown
Av. Brasília, Doca de Pedrouços
1400-038 Lisbon, Portugal

fchampalimaud.org


Srijan Sivakumar <ssivakum@xxxxxxxxxx> wrote on Sunday, 16/08/2020 at 06:11:
Hi João,

Yes, it'll take some time given the filesystem size, as it has to change the xattrs at each level and then crawl upwards.

The stat is done by the script itself, so the crawl is initiated.
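
If you want to check whether a given directory has been re-accounted, you can also look at its quota xattrs directly on the brick (using the brick path from your earlier mail), e.g.
# getfattr -n trusted.glusterfs.quota.dirty -e hex /tank/volume2/brick/projectB
Once the accounting is updated for that directory, the dirty flag should return to 0x3000, as in your original output.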

Regards,
Srijan Sivakumar

On Sun 16 Aug, 2020, 04:58 João Baúto, <joao.bauto@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
Hi Srijan & Strahil,

I ran the quota_fsck script mentioned in Hari's blog post on all bricks and it detected a lot of size mismatches.

The script was executed as,
  • python quota_fsck.py --sub-dir projectB --fix-issues /mnt/tank /tank/volume2/brick (on all nodes and bricks)
Here is a snippet from the script,

Size Mismatch    /tank/volume2/brick/projectB {'parents': {'00000000-0000-0000-0000-000000000001': {'contri_file_count': 18446744073035296610L, 'contri_size': 18446645297413872640L, 'contri_dir_count': 18446744073709527653L}}, 'version': '1', 'file_count': 18446744073035296610L, 'dirty': False, 'dir_count': 18446744073709527653L, 'size': 18446645297413872640L} 15204281691754
MARKING DIRTY: /tank/volume2/brick/projectB
stat on /mnt/tank/projectB
Files verified : 683223
Directories verified : 46823
Objects Fixed : 705230

Checking the xattrs on the bricks, I can see the directory in question marked as dirty:
# getfattr -d -m. -e hex /tank/volume2/brick/projectB
getfattr: Removing leading '/' from absolute path names
# file: tank/volume2/brick/projectB
trusted.gfid=0x3ca2bce0455945efa6662813ce20fc0c
trusted.glusterfs.9582685f-07fa-41fd-b9fc-ebab3a6989cf.xtime=0x5f372478000a7705
trusted.glusterfs.dht=0xe1a4060c000000003ffffffe5ffffffc
trusted.glusterfs.mdata=0x010000000000000000000000005f3724750000000013ddf679000000005ce2aff90000000007fdacb0000000005ce2aff90000000007fdacb0
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x00000ca6ccf7a80000000000000790a1000000000000b6ea
trusted.glusterfs.quota.dirty=0x3100
trusted.glusterfs.quota.limit-set.1=0x0000640000000000ffffffffffffffff
trusted.glusterfs.quota.size.1=0x00000ca6ccf7a80000000000000790a1000000000000b6ea

Now, my question is: how do I trigger Gluster to recalculate the quota for this directory? Is it automatic but just takes a while? I ask because the quota list did change, but not to a good "result".

Path        Hard-limit  Soft-limit   Used       Available  Soft-limit exceeded?  Hard-limit exceeded?
/projectB   100.0TB     80%(80.0TB)  16383.9PB  190.1TB    No                    No

I would like to avoid a disable/enable of quota on the volume, as it removes the configs.

Thank you for all the help!
João Baúto
---------------
Scientific Computing and Software Platform
Champalimaud Research
Champalimaud Center for the Unknown
Av. Brasília, Doca de Pedrouços
1400-038 Lisbon, Portugal

fchampalimaud.org


Srijan Sivakumar <ssivakum@xxxxxxxxxx> wrote on Saturday, 15/08/2020 at 11:57:
Hi João,

The quota accounting error is what we're looking at here. I think you've already looked into the blog post by Hari and are using the script to fix the accounting issue.
That should help you fix it.

Let me know if you face any issues while using it.

Regards,
Srijan Sivakumar


On Fri 14 Aug, 2020, 17:10 João Baúto, <joao.bauto@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
Hi Strahil,

I have tried removing the quota for that specific directory and setting it again, but it didn't work (maybe it has to be a quota disable and enable in the volume options). I'm currently testing a solution by Hari with the quota_fsck.py script (https://medium.com/@harigowtham/glusterfs-quota-fix-accounting-840df33fcd3a) and it's detecting a lot of size mismatches in files.

Thank you,
João Baúto
---------------
Scientific Computing and Software Platform
Champalimaud Research
Champalimaud Center for the Unknown
Av. Brasília, Doca de Pedrouços
1400-038 Lisbon, Portugal

fchampalimaud.org


Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote on Friday, 14/08/2020 at 10:16:
Hi João,

Based on your output, it seems that the quota size is different on the 2 bricks.

Have you tried removing the quota and then recreating it? Maybe that will be the easiest way to fix it.
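
Something like this, for example (a sketch, assuming your volume is named tank):
# gluster volume quota tank remove /projectB
# gluster volume quota tank limit-usage /projectB 100TB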

Best Regards,
Strahil Nikolov


On 14 August 2020 at 4:35:14 GMT+03:00, "João Baúto" <joao.bauto@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
>Hi all,
>
>We have a 4-node distributed cluster with 2 bricks per node running
>Gluster 7.7 + ZFS. We use directory quota to limit the space used by
>our members on each project. Two days ago we noticed inconsistent
>space used reported by Gluster in the quota list.
>
>A small snippet of gluster volume quota vol list,
>
> Path        Hard-limit  Soft-limit   Used       Available  Soft-limit exceeded?  Hard-limit exceeded?
> /projectA   5.0TB       80%(4.0TB)   3.1TB      1.9TB      No                    No
> /projectB   100.0TB     80%(80.0TB)  16383.4PB  740.9TB    No                    No
> /projectC   70.0TB      80%(56.0TB)  50.0TB     20.0TB     No                    No
>
>The total space available in the cluster is 360TB and the quota for
>projectB is 100TB; as you can see, it's reporting 16383.4PB used and
>740TB available (already decreased from 750TB).
>
>There was an issue in Gluster 3.x related to wrong directory quota
>(https://lists.gluster.org/pipermail/gluster-users/2016-February/025305.html
>and
>https://lists.gluster.org/pipermail/gluster-users/2018-November/035374.html)
>but it's marked as solved (not sure if the solution still applies).
>
>On projectB:
># getfattr -d -m . -e hex projectB
># file: projectB
>trusted.gfid=0x3ca2bce0455945efa6662813ce20fc0c
>trusted.glusterfs.9582685f-07fa-41fd-b9fc-ebab3a6989cf.xtime=0x5f35e69800098ed9
>trusted.glusterfs.dht=0xe1a4060c000000003ffffffe5ffffffc
>trusted.glusterfs.mdata=0x010000000000000000000000005f355c59000000000939079f000000005ce2aff90000000007fdacb0000000005ce2aff90000000007fdacb0
>trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x0000ab0f227a860000000000478e33acffffffffffffc112
>trusted.glusterfs.quota.dirty=0x3000
>trusted.glusterfs.quota.limit-set.1=0x0000640000000000ffffffffffffffff
>trusted.glusterfs.quota.size.1=0x0000ab0f227a860000000000478e33acffffffffffffc112
>
>On projectA:
># getfattr -d -m . -e hex projectA
># file: projectA
>trusted.gfid=0x05b09ded19354c0eb544d22d4659582e
>trusted.glusterfs.9582685f-07fa-41fd-b9fc-ebab3a6989cf.xtime=0x5f1aeb9f00044c64
>trusted.glusterfs.dht=0xe1a4060c000000001fffffff3ffffffd
>trusted.glusterfs.mdata=0x010000000000000000000000005f1ac6a10000000018f30a4e000000005c338fab0000000017a3135a000000005b0694fb000000001584a21b
>trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x0000067de3bbe20000000000000128610000000000033498
>trusted.glusterfs.quota.dirty=0x3000
>trusted.glusterfs.quota.limit-set.1=0x0000460000000000ffffffffffffffff
>trusted.glusterfs.quota.size.1=0x0000067de3bbe20000000000000128610000000000033498
>
>Any idea on what's happening and how to fix it?
>
>Thanks!
>João Baúto
>---------------
>
>Scientific Computing and Software Platform
>Champalimaud Research
>Champalimaud Center for the Unknown
>Av. Brasília, Doca de Pedrouços
>1400-038 Lisbon, Portugal
>fchampalimaud.org <https://www.fchampalimaud.org/>

--
Thanks and Regards,

SRIJAN SIVAKUMAR

Associate Software Engineer

Red Hat


T: +91-9727532362    

TRIED. TESTED. TRUSTED.
________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
