[RFC] unifying write variants for filesystems

On Sat, Jan 18, 2014 at 08:10:31PM +0000, Al Viro wrote:

> Ouch...  No, I hadn't meant that kind of insanity, but I'd missed the
> problem with scarcity of mappings completely...

OK, that pretty much kills this approach.  Pity...

Folks, what do you think about the following:
	* a new data structure:
struct io_source {
	enum {IO_UIOVEC, IO_PAGEVEC} type;
	union {
		struct iovec *iov;
		struct pvec {
			struct page *page;
			unsigned offset;
			unsigned size;
		} *pvec;
	};
};
	* a new method that would look like aio_write, but take
struct io_source instead of iovec.
	* store the type in iov_iter (normally IO_UIOVEC) and teach the
code dealing with it to do the right thing depending on the type.  I.e. instead
of __copy_from_user_inatomic() do kmap_atomic()/memcpy()/kunmap_atomic() if
it's an IO_PAGEVEC.
	* generic_file_aio_write() analog for new method, converging with
generic_file_aio_write() almost immediately (basically, as soon as iov_iter
has been initialized).
	* new_aio_write() consisting of
{
	struct io_source source = {.type = IO_UIOVEC, .iov = iov};
	return file->f_op-><new_method>(iocb, &source, nr_segs, pos);
}
	* new_sync_write(), doing what do_sync_write() does for files
that have new_aio_write() as ->aio_write().
	* new_splice_write() usable for files that provide that method -
it would collect pipe_buffers, put together a struct pvec array and pass
it to that method.  All mapping of the pages would happen one-by-one
and only around the actual copying of the data.  And, of course, the locking
would be identical to what we do for write()/writev()/aio write.
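
A minimal user-space sketch of the type dispatch described in the list above; it is not kernel code. It assumes a mock_page buffer standing in for struct page, and plain memcpy() standing in both for __copy_from_user_inatomic() and for the kmap_atomic()/memcpy()/kunmap_atomic() sequence. The helper name copy_from_source() and the mock types are hypothetical and do not come from the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>	/* struct iovec */

/* Assumption: user-space stand-in for struct page, just a fixed buffer. */
#define MOCK_PAGE_SIZE 4096
struct mock_page {
	unsigned char data[MOCK_PAGE_SIZE];
};

struct pvec {
	struct mock_page *page;
	unsigned offset;
	unsigned size;
};

enum io_type { IO_UIOVEC, IO_PAGEVEC };

struct io_source {
	enum io_type type;
	union {
		const struct iovec *iov;
		const struct pvec *pvec;
	};
};

/*
 * Hypothetical helper: gather up to len bytes from the first nr_segs
 * segments of src into dst and return the number of bytes copied.
 * The only type-dependent part is how each segment is located.
 */
static size_t copy_from_source(void *dst, size_t len,
			       const struct io_source *src,
			       unsigned long nr_segs)
{
	unsigned char *out = dst;
	size_t done = 0;
	unsigned long i;

	for (i = 0; i < nr_segs && done < len; i++) {
		const void *from;
		size_t n;

		if (src->type == IO_UIOVEC) {
			/* kernel side: __copy_from_user_inatomic() */
			from = src->iov[i].iov_base;
			n = src->iov[i].iov_len;
		} else {
			const struct pvec *pv = &src->pvec[i];

			/* kernel side: kmap_atomic(), memcpy(), kunmap_atomic() */
			assert(pv->offset + pv->size <= MOCK_PAGE_SIZE);
			from = pv->page->data + pv->offset;
			n = pv->size;
		}
		if (n > len - done)
			n = len - done;	/* clamp to the space left in dst */
		memcpy(out + done, from, n);
		done += n;
	}
	return done;
}
```

In this sketch the caller never looks at the segment type; that mirrors the idea that the code paths converge as soon as the iterator has been set up.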

	Then filesystems can switch to that new method, flipping their
aio_write() instances to the new type and replacing ->aio_write with
new_aio_write, ->write with new_sync_write and ->splice_write with
new_splice_write.

	Actually, it might be possible to use it for *all* instances of
->splice_write() - we'd need to store something like a pointer to a
"call this to try and steal this page" function in pvec and allow the
method to do the actual stealing.  Note that pipe_buffer ->steal() instances
only use the page argument - they all ignore which pipe it's in (and
there's nothing they could usefully do even if they knew which pipe it had
been in in the first place).
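
A rough user-space sketch of the "steal" idea above, under several assumptions: mock_page stands in for struct page, the steal pointer takes only the page (as noted above) and returns 0 on success, and only whole-page segments are considered for stealing. The names fill_slot() and steal_if_unpinned() are hypothetical; nothing here is an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: user-space stand-in for struct page. */
#define MOCK_PAGE_SIZE 4096
struct mock_page {
	unsigned char data[MOCK_PAGE_SIZE];
	int pinned;	/* mock flag: someone else still uses the page */
};

struct pvec {
	struct mock_page *page;
	unsigned offset;
	unsigned size;
	/* Hypothetical member: try to take ownership, 0 on success. */
	int (*steal)(struct mock_page *page);
};

/* Mock steal callback: succeeds only when the page is not pinned. */
static int steal_if_unpinned(struct mock_page *page)
{
	return page->pinned ? 1 : 0;
}

/*
 * Hypothetical consumer: fill one destination slot from pv.
 * Returns 1 if the source page was stolen (slot points at it),
 * 0 if the data was copied into the spare page instead.
 */
static int fill_slot(struct mock_page **slot, struct mock_page *spare,
		     const struct pvec *pv)
{
	int whole = pv->offset == 0 && pv->size == MOCK_PAGE_SIZE;

	assert(pv->offset + pv->size <= MOCK_PAGE_SIZE);

	/* Assumption of this sketch: only whole pages are worth stealing. */
	if (whole && pv->steal && pv->steal(pv->page) == 0) {
		*slot = pv->page;
		return 1;
	}

	/* Fallback: ordinary copy, as for any other source. */
	memcpy(spare->data + pv->offset, pv->page->data + pv->offset,
	       pv->size);
	*slot = spare;
	return 0;
}
```

The method decides whether to attempt the steal at all, and a failed or absent callback simply degrades to the normal copy path.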

	This is very preliminary, of course, and I might easily miss
something - the previous idea was unworkable, after all.  Comments
would be very welcome...