On Thu, Apr 25, 2019 at 10:04:34AM +0000, David Laight wrote: > From: Kirill Smelkov > > Sent: 24 April 2019 19:30 > > > > On Wed, Apr 24, 2019 at 10:26:55AM -0700, Linus Torvalds wrote: > > > On Wed, Apr 24, 2019 at 10:19 AM Sasha Levin <sashal@xxxxxxxxxx> wrote: > > > > > > > > Hm, I might be confusing something here but I see a bunch of patches > > > > that convert existing callers mentioned in this patch to use > > > > stream_open() which was introduced here. > > > > > > The only use of stream_open() upstream right now is the xenbus > > > conversion, and that isn't actually a bugfix, because xenbus used to > > > manually do that > > > > > > filp->f_mode &= ~FMODE_ATOMIC_POS; /* cdev-style semantics */ > > > > > > that stream_open() does. > > > > > > So no, there isn't "a bunch of patches" anywhere. > > > > > > There are *future* cleanups for 5.2 that will happen, and that might > > > have hit linux-next. And there is at least one FUSE patch (again - > > > pending, not upstream) that may get marked for stable. > > > > > > But I see nothing right now that makes it stable material yet. > > > > Linus, thanks for explaining. Sasha, Greg, there is a FUSE patch that > > should be stable material that will need this stream_open() thing. That > > patch has just entered fuse.git#for-next today: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git/commit/?id=bbd84f33652f > > > > and will hopefully enter 5.2 when merge window opens. I agree we should > > not blindly backport bulk stream_open conversions as performed by > > stream_open.cocci, at least unless there is a bug report indicating that > > it is actually required for a particular driver. On the other hand both > > Xen and FUSE deadlocks were hit for real which justifies stable > > propagation for their fixes. > > > > You can read about the deadlock regression and the plan to fix it in > > original "fs: stream_open - opener for stream-like files so that read > > and write can run simultaneously without deadlock" patch (the 59/66 > > patch that was send in this thread), or here: > > > > https://git.kernel.org/linus/10dce8af3422 > > > > > > Hope it clarifies things a bit, > > I can also imagine drivers that expect accesses to be done using > pread() and pwrite() - maybe only if the fd is shared. > Provided accesses get the correct offset they can be concurrent. > In fact they only need to update the offset in the file structure > when they complete - they may do this already. > > I know (I think) uclibc implementing pread() as lseek() + read() > caused me grief - but that might just have been the extra system > call overhead rather than any problems with the offset. I'm not sure I understand your comment completely, but we convert to stream_open only drivers that actually do _not_ use position at all, and that were already using nonseekable_open, thus pread and pwrite were already returning -ESPIPE for them (nonseekable_open clears FMODE_{PREAD,PWRITE} and ksys_{pread,pwrite}64 check for that flag). We also convert only drivers that use no_llseek for .llseek, so lseek on those files is/was always returning -ESPIPE as well. If a driver uses position in its read and write and has support for pread/pwrite (FMODE_PREAD and FMODE_PWRITE), pread and pwrite are already working _without_ file->f_pos locking - because those system calls do not semantically update file->f_pos at all and thus do not take file->f_pos_lock - i.e. pread/pwrite can be run simultaneously already. If libc implements pread as lseek+read it will work for a single user case (single thread, or fd not shared between processes), but it will break because of lseek+read non-atomicity if multiple preads are simultaneously used from several threads. And also for such emulation for multiple users case there is a chance for pread vs pwrite deadlock, since those system calls are using read and write and read and write take file->f_pos_lock. Kirill