On 5/25/22 22:31, Vivek Goyal wrote:
On Fri, May 20, 2022 at 10:04:43AM +0530, Dharmendra Singh wrote:
From: Dharmendra Singh <dsingh@xxxxxxx>
In general, as of now, in FUSE, direct writes on the same file are
serialized over the inode lock, i.e. we hold the inode lock for the full
duration of the write request. I could not find a comment in the fuse
code which clearly explains why this exclusive lock is taken for direct
writes. Our guess is that some user space fuse implementations might be
relying on this lock for serialization, and that it also protects
against issues arising from file size assumptions or write failures.
This patch relaxes this exclusive lock in some cases of direct writes.
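To make "relaxes this exclusive lock in some cases" concrete, here is a
minimal user space sketch of how such a decision could be structured.
The criteria are assumptions drawn from the concerns named in this
thread (user space servers relying on serialization, file size), not
the patch's actual conditions. All names (dio_pick_lock, struct
dio_write, server_allows_parallel) are hypothetical; in the kernel the
two modes would correspond to inode_lock() and inode_lock_shared().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock modes for a direct write. */
enum dio_lock_mode { DIO_LOCK_SHARED, DIO_LOCK_EXCLUSIVE };

/* Hypothetical description of one direct write request. */
struct dio_write {
    uint64_t offset;   /* starting file offset */
    uint64_t len;      /* number of bytes to write */
    bool append;       /* append-style write, offset decided at write time */
};

/*
 * Decide which lock mode a direct write needs.
 * Assumptions (not taken from the patch):
 *  - the user space server must opt in to parallel writes,
 *  - appending writes keep the exclusive lock,
 *  - writes that would grow the file past i_size keep the exclusive lock.
 */
static enum dio_lock_mode
dio_pick_lock(const struct dio_write *w, uint64_t i_size,
              bool server_allows_parallel)
{
    if (!server_allows_parallel)
        return DIO_LOCK_EXCLUSIVE;
    if (w->append)
        return DIO_LOCK_EXCLUSIVE;
    /* Overflow-safe check for offset + len > i_size. */
    if (w->len > i_size || w->offset > i_size - w->len)
        return DIO_LOCK_EXCLUSIVE;
    return DIO_LOCK_SHARED;
}
```

With i_size = 8192, a 4096 byte write at offset 0 or 4096 would get the
shared mode in this sketch, while the same write at offset 8000 would
fall back to the exclusive mode because it extends the file.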
I have this question as well. My understanding was that in general,
reads can do shared lock while writes have to take exclusive lock.
And I assumed that extends to both buffered as well as direct
writes.
I would also like to understand what the fundamental restriction is
and why O_DIRECT is special such that this restriction does not apply.
Is any other file system doing this as well?
If the fuse server directory is shared with other fuse clients, it is
possible that i_size in this client is stale. Will that be a problem? I
guess if that is a problem, then even a single write will be a problem,
because two fuse clients might be trying to write.
Just trying to make sure that it is safe to allow parallel direct
writes.
I think what is missing in this series is a comment explaining when this
lock is needed at all. Our network file system is log structured: any
parallel writes to the same file from different remote clients are
handled by adding fragments on the network server side, which is safe
without locks due to byte level accuracy. The exception is conflicting
writes, where the last client wins, but then the application is doing
'silly' things and locking would not help either. Random parallel
writes from the same (network) client are even an ideal case for us, as
they are handled through shared blocks for different fragments (file
offset + len). So for us shared writes are totally safe.
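As a rough illustration of the fragment idea, and not our actual
implementation, the sketch below keeps an append-only list of fragments
(file offset + len) and resolves a read by letting the newest fragment
that covers a byte win. Everything here is hypothetical: the names
(struct frag_log, frag_log_append, frag_log_read_byte), the fixed
MAX_FRAGS limit, and the simplification that each fragment carries a
single fill byte instead of real data. It is single threaded and only
shows the "last writer wins" resolution rule, not the lockless handling
on the server.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_FRAGS 64   /* arbitrary limit for this sketch */

/* One write, recorded as a fragment: file offset + len, plus its data.
 * The data is reduced to one fill byte to keep the sketch short. */
struct fragment {
    uint64_t offset;
    uint64_t len;
    char fill;
};

/* Append-only log of fragments; newer fragments have higher indexes. */
struct frag_log {
    struct fragment frags[MAX_FRAGS];
    size_t count;
};

/* Record a write as a new fragment. Returns 0 on success, -1 if full. */
static int frag_log_append(struct frag_log *log, uint64_t offset,
                           uint64_t len, char fill)
{
    if (log->count == MAX_FRAGS)
        return -1;
    log->frags[log->count++] = (struct fragment){ offset, len, fill };
    return 0;
}

/* Read one byte: scan from newest to oldest, so the last write that
 * covers this position wins. Bytes never written read as 0. */
static char frag_log_read_byte(const struct frag_log *log, uint64_t pos)
{
    for (size_t i = log->count; i-- > 0; ) {
        const struct fragment *f = &log->frags[i];
        if (pos >= f->offset && pos - f->offset < f->len)
            return f->fill;
    }
    return 0;
}
```

Two non-overlapping writes simply end up as two independent fragments,
which is why no ordering between them is needed. Only overlapping
writes depend on arrival order, and there the later fragment wins.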
When Dharmendra and I discussed the lock, we came up with a few write
error handling cases where that lock might be needed. I guess those
should be added as a comment.