On 2/2/20 3:08 AM, chenqiwu wrote: > On Sun, Feb 02, 2020 at 12:09:50AM +0100, Bernd Schubert wrote: >> >> >> On 2/1/20 6:49 AM, qiwuchen55@xxxxxxxxx wrote: >>> From: chenqiwu <chenqiwu@xxxxxxxxxx> >>> >>> Apparently our current rwsem code doesn't like doing the trylock, then >>> lock for real scheme. So change our direct write method to just do the >>> trylock for the RWF_NOWAIT case. >>> This seems to fix AIM7 regression in some scalable filesystems upto ~25% >>> in some cases. Claimed in commit 942491c9e6d6 ("xfs: fix AIM7 regression") >>> >>> Signed-off-by: chenqiwu <chenqiwu@xxxxxxxxxx> >>> --- >>> fs/fuse/file.c | 8 +++++++- >>> 1 file changed, 7 insertions(+), 1 deletion(-) >>> >>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c >>> index ce71538..ac16994 100644 >>> --- a/fs/fuse/file.c >>> +++ b/fs/fuse/file.c >>> @@ -1529,7 +1529,13 @@ static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from) >>> ssize_t res; >>> >>> /* Don't allow parallel writes to the same file */ >>> - inode_lock(inode); >>> + if (iocb->ki_flags & IOCB_NOWAIT) { >>> + if (!inode_trylock(inode)) >>> + return -EAGAIN; >>> + } else { >>> + inode_lock(inode); >>> + } >>> + >>> res = generic_write_checks(iocb, from); >>> if (res > 0) { >>> if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) { >>> >> >> >> I would actually like to ask if we can do something about this lock >> altogether. Replace it with a range lock? This very lock badly hurts >> fuse shared file performance and maybe I miss something, but it should >> be needed only for writes/reads going into the same file? >> > I think replacing the internal inode rwsem with a range lock maybe not > a good idea, because it may cause potential block for different writes/reads > routes when this range lock is owned by someone. Using internal inode rwsem > can avoid this range racy. > So your 2nd patch changes to rw-locks and should solve low read direct-io performance, but single shared file writes is still an issue. For network file systems it also common to globally enforce fuse direct-io to reduce/avoid cache coherency issues - the application typically doesn't ask for that on its own. And that is where this lock is badly hurting. Hmm, maybe we should differentiate between fuse-internal direct-io and application direct-io requests here? Or we need a range lock,that supports shared readers (I haven't looked at any of the proposed range lock patches yet (non has landed yet, right?). Thanks, Bernd