NFS cosmetic quota exceeded bug

Hi

Has anyone else experienced this? I spent quite a bit of time googling
and can’t find a similar issue. I am also unsure whether this occurs in
CentOS 8.

Occurs on Rocky 8.6 (kernel 4.18.0) and 9.0 (kernel 5.14.0) and
apparently on Ubuntu 20.04, although another team tested that OS for us.

Tested against an NFS 4.1 mount from a Pure Storage device as well as an
NFS 4.2 mount from an xfs volume on a Rocky 9 server.
The user namespaces (uids) were identical for all tests.


Summary:
NFS client continues to issue “Disk quota exceeded” errors after quota
is raised. This is only for block quotas, not inode quotas. It appears
to be related to client side attribute caching.


Description:
An NFS file system is mounted on the host on which the client is working.
The client is over quota and tries to write to a file (call this FileA.txt).
The client gets a “Disk quota exceeded” error, as expected.

Admin now increases the quota sufficiently to allow the user to continue
writing to FileA.txt. However writes to this particular file still
produce “Disk quota exceeded” errors, even though client successfully
writes to the file. Writes to other files do not produce errors so long
as client did not attempt to write to them while quota was exceeded.
Writes to FileA.txt on other hosts which have the NFS file system
mounted do not throw this error, even while the error is simultaneously
presenting itself on the initial host. Copying the file to another file
name and then overwriting the original FileA.txt ‘fixes’ the problem.
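
The copy-and-overwrite workaround described above could be sketched as the shell function below. It assumes that "overwriting" means renaming the copy over the original, so that FileA.txt ends up with a new inode; the function name replace_via_copy and the temporary file name are hypothetical, and this has only been reasoned about, not verified against an affected NFS mount.

```shell
#!/bin/sh
# replace_via_copy FILE
# Copy FILE to a temporary name in the same directory, then rename the
# copy over the original. After this the path refers to a new inode.
# Assumption: this mirrors the manual workaround from the report.
replace_via_copy() {
    file=$1
    tmp="$file.copy.$$"            # hypothetical temporary name
    cp -p "$file" "$tmp" || return 1
    mv -f "$tmp" "$file"
}
```

Usage would be `replace_via_copy FileA.txt`, run after the quota has been raised, since the copy itself needs free quota.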

The same mounts were also exported to a CentOS 7.3 server (kernel
3.10.0), and the error did not occur there: raising the user quota after
a file write has caused a “Disk quota exceeded” error allows subsequent
writes to that file with no further error messages.

Note 1: when the FS is mounted with the noac option this bug does not
occur. Setting actimeo=0 on its own, however, does not fix the bug. The
noac option is a combination of the generic option sync and the
NFS-specific option actimeo=0. Hence it appears that the issue is caused
by the default async behaviour, and that setting noac fixes it by
forcing sync.
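
To separate the effect of sync from that of actimeo=0, the mount cases from Note 1 could be compared side by side. The sketch below only prints the mount commands (a dry run); the mount point /mnt/nfs-test and the function names are hypothetical, and the sync-only and sync,actimeo=0 cases are suggested extra tests, not results from this report.

```shell
#!/bin/sh
# Export taken from this report; the mount point is a hypothetical example.
EXPORT="rocky9.server:/opt/nfs"
MNT="/mnt/nfs-test"

# Print (do not run) the mount command for one set of options.
mount_cmd() {
    printf 'mount -t nfs -o vers=4.2,%s %s %s\n' "$1" "$EXPORT" "$MNT"
}

# Cases to compare: noac (reported as working), actimeo=0 (reported as
# not working), and two untested cases that isolate the sync option.
print_cases() {
    for opts in noac actimeo=0 sync sync,actimeo=0; do
        mount_cmd "$opts"
    done
}
```

Calling print_cases lists four commands; if the sync-only mount also avoids the error, that would support the async explanation.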

Note 2: inode quotas do not cause this issue and behave as expected.

Note 3: making use of soft vs hard quotas does not change the behaviour.
The issue occurs at the hard quota.

Note 4: according to tcpdump, the server is not passing error messages
to the client during this condition.
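
One way to check Note 4 mechanically is to count the NFS replies in the capture that carry an error. The sketch below assumes tcpdump's text output with one packet per line (unlike the mail-wrapped excerpts in this message); the function name count_nfs_error_replies is hypothetical.

```shell
#!/bin/sh
# Read tcpdump text output on stdin and print the number of lines that
# are NFS replies containing "ERROR:". Prints 0 when there are none.
count_nfs_error_replies() {
    awk '/NFS reply/ && /ERROR:/ { n++ } END { print n + 0 }'
}
```

It could be fed from a saved capture, for example `tcpdump -r capture.pcap | count_nfs_error_replies`, where capture.pcap is an assumed file name. A count of 0 during the failing writes would match the observation that the error comes from the client.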


Setup:
SELinux and all firewalls disabled

exportfs -v from my xfs NFS server:
/opt/nfs
rocky8.client(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
/opt/nfs
centos7.server(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)

All servers mentioned here:
cat /proc/fs/nfsd/versions
-2 +3 +4 +4.1 +4.2

For the xfs server setup:
acl client = 2.2.53-1.el8.1
acl server = 2.3.1-3.el9
libgssapi: no such package in Rocky
libevent client = 2.1.8-5.el8
libevent server = 2.1.12-6.el9
librpcsecgss: no such package in Rocky
nfs-utils client = 1:2.3.3-51.el8
nfs-utils server = 1:2.5.4-10.el9
util-linux client = 2.32.1-35.el8
util-linux server = 2.37.4-3.el9



TCPDUMP:
We start dumping data to FileA.txt:

# cat data >> FileA.txt

16:42:39.788810 IP rocky8.client.943 > rocky9.server.nfs: Flags [.], seq
2497532:2498980, ack 4153, win 12282, options [nop,nop,TS val 2571582773
ecr 4267237588], length 1448
16:42:39.788822 IP rocky8.client.943 > rocky9.server.nfs: Flags [.], seq
2498980:2500428, ack 4153, win 12282, options [nop,nop,TS val 2571582773
ecr 4267237588], length 1448
16:42:39.788823 IP rocky9.server.nfs > rocky8.client.943: Flags [.], ack
2500428, win 24568, options [nop,nop,TS val 4267237589 ecr 2571582773],
length 0
16:42:39.788834 IP rocky8.client.943 > rocky9.server.nfs: Flags [.], seq
2500428:2501876, ack 4153, win 12282, options [nop,nop,TS val 2571582773
ecr 4267237588], length 1448

# cat data >> FileA.txt

16:42:39.788847 IP rocky8.client.943 > rocky9.server.nfs: Flags [.], seq
2501876:2503324, ack 4153, win 12282, options [nop,nop,TS val 2571582773
ecr 4267237588], length 1448
16:42:39.788849 IP rocky9.server.nfs > rocky8.client.943: Flags [.], ack
2503324, win 24568, options [nop,nop,TS val 4267237589 ecr 2571582773],
length 0
16:42:39.788856 IP rocky8.client.943 > rocky9.server.nfs: Flags [P.],
seq 2503324:2503872, ack 4153, win 12282, options [nop,nop,TS val
2571582773 ecr 4267237588], length 548

# cat data >> FileA.txt

16:42:39.788903 IP rocky9.server.nfs > rocky8.client.943: Flags [P.],
seq 4153:4253, ack 2503872, win 24568, options [nop,nop,TS val
4267237589 ecr 2571582773], length 100: NFS reply xid 4118676701 reply
ok 96 getattr ERROR: Disc quota exceeded
16:42:39.789416 IP rocky8.client.943 > rocky9.server.nfs: Flags [P.],
seq 2503872:2504072, ack 4253, win 12282, options [nop,nop,TS val
2571582775 ecr 4267237589], length 200: NFS request xid 4135453917 196
getattr fh 0,2/53
16:42:39.790175 IP rocky9.server.nfs > rocky8.client.943: Flags [P.],
seq 4253:4361, ack 2504072, win 24568, options [nop,nop,TS val
4267237590 ecr 2571582775], length 108: NFS reply xid 4135453917 reply
ok 104 getattr NON 3 ids 0/-530227613 sz 695948683
16:42:39.830384 IP rocky8.client.943 > rocky9.server.nfs: Flags [.], ack
4361, win 12282, options [nop,nop,TS val 2571582816 ecr 4267237590],
length 0



User ID      Used   Soft   Hard Warn/Grace
---------- ---------------------------------
alewis       3.9M     4M     4M  00 [------]


So now we grant the user more than enough space to continue extending
the file...

# setquota -u alewis 5M 5M 1000 1000 -a /opt/nfs
# xfs_quota -x -c 'report -h' /opt/nfs/

User quota on /opt/nfs (/dev/mapper/VGsplunk-LVsplunk)
User ID      Used   Soft   Hard Warn/Grace
---------- ---------------------------------
alewis       3.9M     5M     5M  00 [------]


But the error remains:

# cat data >> FileA.txt
cat: write error: Disk quota exceeded
<nothing in tcpdump>
alewis       4.0M     5M     5M  00 [------]

# cat data >> FileA.txt
cat: write error: Disk quota exceeded
<nothing in tcpdump>
alewis       4.2M     5M     5M  00 [------]

# cat data >> FileA.txt
cat: write error: Disk quota exceeded
<nothing in tcpdump>
alewis       4.4M     5M     5M  00 [------]

Finally we exceed the new limit:
# cat data >> FileA.txt
cat: write error: Input/output error
cat: write error: Disk quota exceeded
16:47:32.739902 IP rocky9.server.nfs > rocky8.client.943: Flags [P.],
seq 5185:5285, ack 1505904, win 24568, options [nop,nop,TS val
4267530540 ecr 2571875724], length 100: NFS reply xid 2726233309 reply
ok 96 getattr ERROR: Disc quota exceeded
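
The symptom in the transcript above (an error is reported while usage still grows) could be made explicit with a small helper that records both the exit status of the append and the change in file size. This is a sketch only: the function name append_and_report is hypothetical, and on an NFS client the size comes from the client's view of the file, which may itself be cached.

```shell
#!/bin/sh
# append_and_report DATA TARGET
# Append DATA to TARGET with cat, then print whether cat reported an
# error and by how many bytes TARGET grew.
append_and_report() {
    data=$1
    target=$2
    if [ -f "$target" ]; then
        before=$(wc -c < "$target")
    else
        before=0
    fi
    if cat "$data" >> "$target"; then
        status=ok
    else
        status=error
    fi
    after=$(wc -c < "$target")
    echo "status=$status grew=$((after - before))"
}
```

With the behaviour described in this report, `append_and_report data FileA.txt` would be expected to print status=error together with a non-zero grew value once the quota has been raised.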




