Hi,

Especially with buffered IO it's fairly easy to hit contention on the
inode lock during writes. With something like io_uring it's even
easier, because it currently (but see [1]) farms out buffered writes to
workers, which can then easily contend on the inode lock even if only
one process submits writes. But I've seen it in plenty of other cases
too.

Looking at the code, I noticed that several parts of the "nowait aio
support" series (cf. 728fbc0e10b7f3) introduced code like:

static ssize_t
ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
{
...
        if (!inode_trylock(inode)) {
                if (iocb->ki_flags & IOCB_NOWAIT)
                        return -EAGAIN;
                inode_lock(inode);
        }

Isn't trylocking and then locking in a blocking fashion an inefficient
pattern? I.e. I think this should be

        if (iocb->ki_flags & IOCB_NOWAIT) {
                if (!inode_trylock(inode))
                        return -EAGAIN;
        } else
                inode_lock(inode);

Obviously this isn't going to improve scalability to a very significant
degree. But not unnecessarily doing two atomic ops on a contended lock
can't hurt scalability either. Also, the current code just seems
confusing.

Am I missing something?

Greetings,

Andres Freund

[1] https://lore.kernel.org/linux-block/20190910164245.14625-1-axboe@xxxxxxxxx/
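
P.S. For anyone who wants to compare the two orderings outside the kernel, here is a rough userspace sketch. It is only an analogy: a pthread mutex stands in for the inode lock, a plain bool stands in for IOCB_NOWAIT, and the names demo_lock, lock_try_then_block, lock_branch_first and demo_unlock are made up for illustration. It shows that both orderings return the same results and differ only in whether a blocking caller attempts a trylock first. It makes no claim about actual kernel behaviour or performance.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the inode lock; this is not kernel code. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Ordering used by the existing code: always trylock first, and only
 * then decide between failing with -EAGAIN and blocking. A blocking
 * caller that loses the trylock touches the lock a second time.
 */
static int lock_try_then_block(bool nowait)
{
    if (pthread_mutex_trylock(&demo_lock) != 0) {
        if (nowait)
            return -EAGAIN;
        pthread_mutex_lock(&demo_lock);
    }
    return 0;
}

/*
 * Suggested ordering: decide on nowait first, so each caller performs
 * exactly one lock operation (a trylock or a blocking lock).
 */
static int lock_branch_first(bool nowait)
{
    if (nowait) {
        if (pthread_mutex_trylock(&demo_lock) != 0)
            return -EAGAIN;
    } else {
        pthread_mutex_lock(&demo_lock);
    }
    return 0;
}

/* Release the stand-in lock after a successful acquisition. */
static void demo_unlock(void)
{
    pthread_mutex_unlock(&demo_lock);
}
```

Both helpers return 0 with the lock held, or -EAGAIN without it when nowait is set and the lock is busy, so callers can swap one for the other without other changes.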