It would be clearer if I update the reproducer like this:
        nfs server                 |       nfs client
 --------------------------------- |---------------------------------
 # No space left on server         |
 fallocate -l 100G /server/nospace |
                                   | mount -t nfs $nfs_server_ip:/ /mnt
                                   |
                                   | # Expected error
                                   | dd if=/dev/zero of=/mnt/file
                                   |
                                   | # Release space on mountpoint
                                   | rm /mnt/nospace
                                   |
                                   | # Unexpected error
                                   | dd if=/dev/zero of=/mnt/file
The unexpected error (No space left on device) from the second `dd`
comes from a writeback error that was left unconsumed when the file
was closed during the first `dd`. There is enough space by the time the
second `dd` runs, so we should not report the no-space error.

We should report and consume the writeback error when userspace calls
close()->flush(); the writeback error should not be left for the next
open().

Currently, fsync() consumes the writeback error by calling
file_check_and_advance_wb_err(). close()->flush() should consume the
writeback error as well.
On 2022/3/6 0:53, Trond Myklebust wrote:
'rm' doesn't open any files or do any I/O, so it shouldn't be returning
any errors from the page cache.
IOW: The problem here is not that we're failing to clear an error from
the page cache. It is that something in 'rm' is checking the page cache
and returning any errors that it finds there.
Is 'rm' perhaps doing a stat() on the file it is deleting? If so, does
this patch fix the bug?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d19e0183a883