On 6/28/22 20:35, Dave Chinner wrote:
> How do you explain the API semantics to an app developer that
> might want to use this functionality? RENAME_EXCHANGE_WITH_NEWER
> would be atomic in the sense you either get the old or new file at
> the destination, but it's not atomic in the sense that it is
> serialised against all other potential modification operations
> against either the source or destination. Hence the "if newer"
> comparison is not part of the "atomic rename" operation that is
> supposedly being performed...

So the current proposal, based on feedback, is to move the mtime
comparison into vfs_rename() to take advantage of the existing
{lock,unlock}_two_nondirectories critical section, then nest a second
critical section, {deny,allow}_write_access (adapted to inodes), inside
it to stabilize the mtime. The proposed use case never needs to compare
mtimes of files that are open for write, and the plan would be to return
-ETXTBSY if either file is open for write.
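
To make the ordering concrete, here is a rough userspace model of that
nesting. It is only a sketch, not the kernel patch: struct toy_inode and
the toy_* helpers are hypothetical stand-ins, the writers counter is
loosely modeled on i_writecount, and the return convention (1 exchanged,
0 not newer, negative errno on error) is an assumption made for the
example. It also assumes the exchange happens only when the source mtime
is strictly newer than the destination mtime.

```c
#include <errno.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the inode state that matters here. */
struct toy_inode {
    struct timespec mtime; /* last modification time */
    int writers;           /* > 0: open for write, < 0: writes denied */
};

/* Loosely modeled on deny_write_access(): refuse if anyone is writing. */
static int toy_deny_write(struct toy_inode *inode)
{
    if (inode->writers > 0)
        return -ETXTBSY;
    inode->writers--;
    return 0;
}

/* Undo one toy_deny_write(). */
static void toy_allow_write(struct toy_inode *inode)
{
    inode->writers++;
}

/* True if a is strictly newer than b. */
static bool toy_newer(const struct timespec *a, const struct timespec *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec > b->tv_sec;
    return a->tv_nsec > b->tv_nsec;
}

/*
 * src and dst model two directory entries. Returns 1 if they were
 * exchanged, 0 if src was not newer, -ETXTBSY if either is open for write.
 */
static int toy_exchange_if_newer(struct toy_inode **src, struct toy_inode **dst)
{
    struct toy_inode *s = *src, *d = *dst;
    int ret;

    /* In the kernel, lock_two_nondirectories() would already be held. */
    ret = toy_deny_write(s);
    if (ret)
        return ret;
    ret = toy_deny_write(d);
    if (ret) {
        toy_allow_write(s);
        return ret;
    }

    /* Both mtimes are stable from here until writes are allowed again. */
    ret = 0;
    if (toy_newer(&s->mtime, &d->mtime)) {
        *src = d;
        *dst = s;
        ret = 1;
    }

    toy_allow_write(d);
    toy_allow_write(s);
    return ret;
}
```

The point of the model is that the comparison and the exchange sit inside
the same window in which neither file can gain a writer, so the "if newer"
decision cannot go stale before the swap.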

> I'm also sceptical of the use of mtime - we can't rely on mtime to
> determine the newer file accurately on all filesystems. e.g. some
> filesystems only have second granularity in their timestamps, so
> there's a big window where "newer" cannot actually be determined by
> timestamp comparisons.

So in the "use a directory as a key/value store" use case in distributed
systems, the file mtime is generally determined remotely by the file
content creator and is set locally via futimens() rather than the local
system clock. So this gives you nanosecond scale time resolution if the
content creator supports it, even if the system clock has less resolution.
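
As an illustration of that pattern, the sketch below stamps a file with
an externally supplied mtime via futimens() and reads it back. The helper
names stamp_file() and read_mtime() are made up for the example, and the
file mode and error handling are kept minimal on purpose.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Create (or open) path and set its mtime to the value supplied by the
 * content creator. atime is left alone via UTIME_OMIT.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int stamp_file(const char *path, struct timespec mtime)
{
    struct timespec times[2];
    int fd, ret;

    fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    times[0].tv_sec = 0;
    times[0].tv_nsec = UTIME_OMIT; /* do not touch atime */
    times[1] = mtime;              /* mtime chosen remotely */

    ret = futimens(fd, times);
    close(fd);
    return ret;
}

/* Read back the mtime as the filesystem actually stored it. */
static int read_mtime(const char *path, struct timespec *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    *out = st.st_mtim;
    return 0;
}
```

Note that the value read back is only as precise as the timestamp
granularity of the underlying filesystem, so tv_nsec may come back
truncated on filesystems that store coarser timestamps.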
James