On 11/3/2021 8:22 PM, Matthew Wilcox (willy@xxxxxxxxxxxxx) wrote:
> On Wed, Nov 03, 2021 at 11:43:20PM +0000, David Howells wrote:
>> Currently, at the completion of a storage RPC from writepages, the errors
>> ENOSPC, EDQUOT, ENOKEY, EACCES, EPERM, EKEYREJECTED and EKEYREVOKED cause
>> the pages involved to be redirtied and the write to be retried by the VM at
>> a future time.
>>
>> However, this is probably not the right thing to do, and, instead, the
>> writes should be discarded so that the system doesn't get blocked (though
>> unmounting will discard the uncommitted writes anyway).
> umm. I'm not sure that throwing away the write is the best answer
> for some of these errors. Our whole story around error handling in
> filesystems, the page cache and the VFS is pretty sad, but I don't think
> that this is the right approach.
>
> Ideally, we'd hold onto the writes in the page cache until (eg for ENOSPC
> / EDQUOT), the user has deleted some files, then retry the writes.

Hi Matthew,

I agree that it would be desirable to avoid discarding user data but
in practice that is hard to do. The proposed behavior change is
consistent with other Unix AFS/AuriStorFS cache manager implementations.

There are many situations which can result in an out of quota or out
of space error where the end user has absolutely no ability to do
anything about it.

An EDQUOT error might occur because the AFS volume has reached its
quota. However, the writer only has insert privilege and cannot delete.
The user might not even be able to list the contents of the volume.

An ENOSPC error might be the result of the backing store for AFS vice
partitions filling due to data being written to other AFS volumes that
the writer has no ability to access or manage.

AFS cache managers frequently implement write-on-close semantics and
will flush dirty content to the fileserver only when the file is closed
or the local cache is out-of-space.
Holding onto dirty data that cannot be flushed to the server on a
multi-user timeshare system can negatively affect other users of the
system. Another risk is that, if dirty data persists locally, the
EDQUOT/ENOSPC errors will be replaced by EACCES or EPERM errors when
the associated authentication credentials expire.

If a back-off strategy is to be implemented in the future, AFS does
provide RPCs that can be used to query the volume's online status, the
maximum quota in 1 KiB blocks, the blocks in use, the available blocks
in the partition, and the maximum number of blocks in the partition.
Querying RXAFS_GetVolumeStatus or RXYFS_GetVolumeStatus can avoid the
overhead of issuing a StoreData operation that is likely to fail.

Jeffrey Altman
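
P.S. To make the pre-check idea above concrete, here is a minimal
sketch in Python (not kernel code). It assumes the reply from a volume
status query has already been decoded into a small record; the
VolumeStatus type, its field names and predicted_store_error() are all
hypothetical and do not reflect the actual RXAFS_GetVolumeStatus reply
layout. Treating a maximum quota of 0 as "no quota" is also an
assumption of the sketch, and the volume's online status is not
modelled.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass
class VolumeStatus:
    """Hypothetical, already-decoded volume status (all values in 1 KiB blocks)."""
    max_quota_kib: int       # maximum quota; 0 is assumed to mean "no quota"
    blocks_in_use_kib: int   # blocks currently used by the volume
    part_avail_kib: int      # blocks still available in the partition
    part_max_kib: int        # total blocks in the partition (unused below)


def predicted_store_error(status: VolumeStatus, write_bytes: int) -> Optional[int]:
    """Return the errno a store of write_bytes would likely hit, or None.

    This is only a prediction: the server remains authoritative, and the
    numbers can change between the status query and the store.
    """
    needed_kib = -(-write_bytes // 1024)  # round up to whole 1 KiB blocks

    # Quota check: would the volume exceed its maximum quota?
    if status.max_quota_kib > 0:
        if status.blocks_in_use_kib + needed_kib > status.max_quota_kib:
            return errno.EDQUOT

    # Partition check: is there enough room left in the backing partition?
    if needed_kib > status.part_avail_kib:
        return errno.ENOSPC

    return None
```

A caller could skip the store and back off whenever the function
returns something other than None, then retry after a later status
query shows room again. The check is advisory only, since another
writer can consume the space between the query and the store.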