Re: [RFC PATCH 0/7] Non-blocking buffered fs read (page cache only)

On Sep 15, 2014, at 2:20 PM, Milosz Tanski <milosz@xxxxxxxxx> wrote:

> This patchset introduces the ability to perform a non-blocking read
> from regular files in buffered IO mode. This works only for those
> filesystems that have data in the page cache.
> 
> It does this by introducing new syscalls readv2/writev2
> and preadv2/pwritev2. These new syscalls behave like the network sendmsg,
> recvmsg syscalls that accept an extra flag argument (O_NONBLOCK).

It's too bad that we are introducing yet another new read/write
syscall pair that only allows IO into discontiguous memory regions,
but does not allow a single call to access discontiguous file regions
(i.e. specify a separate file offset for each iov).

Adding syscalls similar to preadv/pwritev() that could take an iovec
that specified the file offset+length in addition to the memory address
would allow efficient scatter-gather IO in a single syscall.  While
that is less critical for local filesystems with small syscall latency,
it is more important for network filesystems, or in the case of
NVRAM-backed filesystems.
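
To make the idea concrete, here is a rough userspace sketch. The names
struct file_iovec and preadv_scatter() are hypothetical and exist only
for illustration; they are not part of this patchset or of any kernel.
The emulation below issues one pread() per segment, whereas a real
syscall of this kind would take the whole array in a single call.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical: an iovec that also carries a file offset per segment. */
struct file_iovec {
	void   *fiov_base;   /* memory address to read into */
	size_t  fiov_len;    /* number of bytes for this segment */
	off_t   fiov_offset; /* file offset for this segment */
};

/*
 * Userspace emulation of the hypothetical call: one pread() per segment.
 * Returns the total number of bytes read, or -1 if the first segment
 * fails. Stops early on a short read (e.g. at EOF).
 */
static ssize_t preadv_scatter(int fd, const struct file_iovec *fiov, int cnt)
{
	ssize_t total = 0;

	for (int i = 0; i < cnt; i++) {
		ssize_t n = pread(fd, fiov[i].fiov_base, fiov[i].fiov_len,
				  fiov[i].fiov_offset);
		if (n < 0)
			return total ? total : -1;
		total += n;
		if ((size_t)n < fiov[i].fiov_len)
			break;	/* short read: do not touch later segments */
	}
	return total;
}
```

A caller would fill an array of struct file_iovec with unrelated file
offsets and get all of them back from one call, which is where the
round-trip savings for network filesystems would come from.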

Cheers, Andreas

> It's a very common pattern today (samba, libuv, etc.) to use a large
> threadpool to perform buffered IO operations. They submit the work
> from another thread that performs network IO and epoll or other threads
> that perform CPU work. This leads to increased latency for processing,
> esp. in the case of data that's already cached in the page cache.
> 
> With the new interface the applications will now be able to fetch the
> data in their network / cpu bound thread(s) and only defer to a
> threadpool if it's not there. In our own application (VLDB) we've
> observed a decrease in latency for "fast" requests by avoiding unnecessary
> queuing and having to swap out current tasks in IO bound work threads.
> 
> I have co-developed these changes with Christoph Hellwig, a whole lot
> of his fixes went into the first patch in the series (were squashed
> with his approval).
> 
> I am going to post the perf report in a reply-to to this RFC.
> 
> Christoph Hellwig (3):
>  documentation updates
>  move flags enforcement to vfs_preadv/vfs_pwritev
>  check for O_NONBLOCK in all read_iter instances
> 
> Milosz Tanski (4):
>  Prepare for adding a new readv/writev with user flags.
>  Define new syscalls readv2,preadv2,writev2,pwritev2
>  Export new vector IO (with flags) to userland
>  O_NONBLOCK flag for readv2/preadv2
> 
> Documentation/filesystems/Locking |    4 +-
> Documentation/filesystems/vfs.txt |    4 +-
> arch/x86/syscalls/syscall_32.tbl  |    4 +
> arch/x86/syscalls/syscall_64.tbl  |    4 +
> drivers/target/target_core_file.c |    6 +-
> fs/afs/internal.h                 |    2 +-
> fs/afs/write.c                    |    4 +-
> fs/aio.c                          |    4 +-
> fs/block_dev.c                    |    9 ++-
> fs/btrfs/file.c                   |    2 +-
> fs/ceph/file.c                    |   10 ++-
> fs/cifs/cifsfs.c                  |    9 ++-
> fs/cifs/cifsfs.h                  |   12 ++-
> fs/cifs/file.c                    |   30 +++++---
> fs/ecryptfs/file.c                |    4 +-
> fs/ext4/file.c                    |    4 +-
> fs/fuse/file.c                    |   10 ++-
> fs/gfs2/file.c                    |    5 +-
> fs/nfs/file.c                     |   13 ++--
> fs/nfs/internal.h                 |    4 +-
> fs/nfsd/vfs.c                     |    4 +-
> fs/ocfs2/file.c                   |   13 +++-
> fs/pipe.c                         |    7 +-
> fs/read_write.c                   |  146 +++++++++++++++++++++++++++++++------
> fs/splice.c                       |    4 +-
> fs/ubifs/file.c                   |    5 +-
> fs/udf/file.c                     |    5 +-
> fs/xfs/xfs_file.c                 |   12 ++-
> include/linux/fs.h                |   16 ++--
> include/linux/syscalls.h          |   12 +++
> include/uapi/asm-generic/unistd.h |   10 ++-
> mm/filemap.c                      |   34 +++++++--
> mm/shmem.c                        |    6 +-
> 33 files changed, 306 insertions(+), 112 deletions(-)
> 
> -- 
> 1.7.9.5
> 








