Re: [Lsf-pc] [LSF/MM TOPIC] async buffered diskio read for userspace apps

On Fri, Jan 16, 2015 at 11:55 AM, Jeremy Allison <jra@xxxxxxxxx> wrote:
> On Fri, Jan 16, 2015 at 10:44:12AM -0500, Milosz Tanski wrote:
>> On Thu, Jan 15, 2015 at 5:31 PM, Jan Kara <jack@xxxxxxx> wrote:
>> > On Thu 15-01-15 12:43:23, Milosz Tanski wrote:
>> >> I would like to talk about enhancing the user interfaces for doing
>> >> async buffered disk IO for userspace applications. There's a whole
>> >> class of distributed web applications (most new applications today)
>> >> that would benefit from such an API. Most of them today rely on
>> >> cobbling one together in user space using a threadpool.
>> >>
>> >> The current in-kernel AIO interfaces only support direct I/O; they
>> >> were generally designed by and for big database vendors. The consensus
>> >> is that the current AIO interfaces usually lead to decreased
>> >> performance for those apps.
>> >>
>> >> I've been developing a new read syscall that allows non-blocking
>> >> diskio read (provided that data is in the page cache). It's analogous
>> >> to what exists today in the network world with recvmsg and the
>> >> MSG_DONTWAIT flag. The work has been previously described by LWN here:
>> >> https://lwn.net/Articles/612483/
>> >>
>> >> Previous attempts (over the last 12+ years) at non-blocking buffered
>> >> diskio have stalled due to their complexity. I would like to talk about
>> >> the problem, my solution, and get feedback on the course of action.
>> >>
>> >> Over the years I've been building the low-level guts of various "web
>> >> applications". That usually involves async network based applications
>> >> (epoll based servers) and the biggest pain point for the last 8+ years
>> >> has been async disk IO.
>> >   Maybe this topic will be sorted out before LSF/MM. I know Andrew had some
>> > objections about doc and was suggesting a solution using fincore() (which
>> > Christoph refuted as being racy). Also there was a pending question
>> > regarding whether the async read in this form will be used by applications.
>> > But if it doesn't get sorted out, a short session on the pending issues
>> > would probably be useful.
>> >
>> >                                                                 Honza
>> > --
>> > Jan Kara <jack@xxxxxxx>
>> > SUSE Labs, CR
>>
>> I've spent the better part of yesterday wrapping up the first cut of
>> samba support to FIO so we can test a modified samba file server with
>> these changes in a few scenarios. Right now it's only sync but I hope
>> to have async in the future. I hope that by the time the summit rolls
>> around I'll have data to share from samba and maybe some other common
>> apps (node.js / twisted).
>
> Don't forget to share the code changes :-). We @ Samba would
> love to see them to keep track !

I have the first version of the FIO cifs support via samba in my fork
of FIO here: https://github.com/mtanski/fio/tree/samba

Right now it only supports the sync mode of FIO (i.e. it can't submit
multiple outstanding requests) but I'm looking into how to make it work
with smb2 read/write calls with the async flag.

Additionally, I'm sure I'm doing some things not quite right in terms
of smbcli usage, as it took a decent amount of trial and error to get it
to connect (esp. the setup before smbcli_full_connection). Finally, it
looks like the more complex API I'm using (as opposed to smbclient,
because I want the async calls) doesn't fully export all the calls I
need via headers / public dynamic libs, so it's a bit of a hack to get it
to build: https://github.com/mtanski/fio/commit/7fd35359259b409ed023b924cb2758e9efb9950c#diff-1

But it works for my randread tests with zipf and the great part is
that it should provide a flexible way to test samba with many fake
clients and access patterns. So... progress.
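
For anyone skimming the thread, the recvmsg analogy from the quoted
proposal looks roughly like this on the network side. This is only a
sketch of the existing socket behaviour, not the proposed read syscall;
try_recv_nowait() and would_block() are names made up for the
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: attempt a read that never sleeps waiting for data.
 * Returns the number of bytes read, 0 on EOF, or -1 with errno set.
 * errno == EAGAIN / EWOULDBLOCK means "nothing available right now".
 */
ssize_t try_recv_nowait(int fd, void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        /* MSG_DONTWAIT makes this single call non-blocking. */
        n = recv(fd, buf, len, MSG_DONTWAIT);
    } while (n < 0 && errno == EINTR);

    return n;
}

/*
 * Hypothetical helper: returns 1 when the caller should fall back to a
 * slow path (for example, handing the request to a thread pool).
 */
int would_block(ssize_t n, int err)
{
    return n < 0 && (err == EAGAIN || err == EWOULDBLOCK);
}
```

A caller would try the non-blocking path first and only hand the request
to a thread pool when would_block() is true, which is the pattern the
proposed read call is meant to allow when the data is already in the
page cache.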

-- 
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: milosz@xxxxxxxxx