Re: [patch v3] splice: fix race with page invalidation

On Tuesday 05 August 2008 01:29, Jamie Lokier wrote:
> Nick Piggin wrote:
> > On Saturday 02 August 2008 04:28, Miklos Szeredi wrote:
> > > On Fri, 1 Aug 2008, Nick Piggin wrote:
> > > > Well, a) it probably makes sense in that case to provide another mode
> > > > of operation which fills the data synchronously from the sender and
> > > > copies it to the pipe (although the sender might just use read/write).
> > > > And b) we could *also* look at clearing PG_uptodate as an
> > > > optimisation iff that is found to help.
> > >
> > > IMO it's not worth it to complicate the API just for the sake of
> > > correctness in the so-very-rare read error case.  Users of the splice
> > > API will simply ignore this requirement, because things will work fine
> > > on ext3 and friends, and will break only rarely on NFS and FUSE.
> > >
> > > So I think it's much better to make the API simple: invalid pages are
> > > OK, and for I/O errors we return -EIO on the pipe.  It's not 100%
> > > correct, but all in all it will result in less buggy programs.
> >
> > That's true, but I hate how we always (in the VM, at least) just brush
> > error handling under the carpet because it is too hard :(
> >
> > I guess your patch is OK, though. I don't see any reasons it could cause
> > problems...
>
> At least, if there are situations where the data received is not what
> a common sense programmer would expect (e.g. blocks of zeros, data
> from an unexpected time in syscall sequence, or something, or just
> "reliable except with FUSE and NFS"), please ensure it's documented in
> splice.txt or wherever.

Not quite true. Many filesystems can return -EIO, and truncate can
partially zero pages.

Basically the man page should note that, until the splice API is
improved, a) -EIO errors will be seen at the receiver, b) the pages
can show transient zeroes (this is the case with read(2) as well,
but splice has a much bigger window), and c) the sender does not
send a snapshot of the data, because the data can still be modified
until it is received.

c is not too surprising for an asynchronous interface, but it is
nice to document in case people are expecting COW or something.
b and c can more or less be worked around by not doing silly things
like truncating or scribbling on the data until the receiver really
has it. a, I argue, should be fixed in the API.
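
To make a) and c) concrete, here is a rough userspace sketch of a
file-to-file copy through a pipe. The helper name splice_copy() is
made up for illustration, and nothing here verifies what a given
kernel actually does. It only shows where a caller would have to look
for the error under the semantics described above: on the draining
(receiver) side, not just on the filling (sender) side.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for splice() in <fcntl.h> */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy up to len bytes from in_fd to out_fd
 * through a private pipe. Returns the number of bytes copied, or -1
 * with errno set.
 */
ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    ssize_t total = 0;
    int saved;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(p) < 0)
        return -1;

    while (len > 0) {
        /* Sender side: queue references to the file's pages. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len, SPLICE_F_MOVE);
        if (n < 0)
            goto fail;
        if (n == 0)
            break;              /* end of file */

        /*
         * Receiver side: drain what was queued. Under the semantics
         * discussed in this thread, an I/O error on a queued page can
         * show up here as EIO, so this return value matters too.
         * The source must not be truncated or rewritten before this
         * loop finishes, since the pipe holds no snapshot.
         */
        ssize_t left = n;
        while (left > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL,
                               (size_t)left, SPLICE_F_MOVE);
            if (m < 0)
                goto fail;
            if (m == 0) {       /* unexpected: treat as an error */
                errno = EIO;
                goto fail;
            }
            left -= m;
        }
        total += n;
        len -= (size_t)n;
    }
    close(p[0]);
    close(p[1]);
    return total;

fail:
    saved = errno;              /* close() may clobber errno */
    close(p[0]);
    close(p[1]);
    errno = saved;
    return -1;
}
```

A caller would treat a -1 return with errno == EIO as "the data never
arrived intact", whichever of the two splice() calls reported it, and
would hold off modifying the source file until splice_copy() returns.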
