Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

Nick Piggin <nickpiggin@xxxxxxxxxxxx> · Thu, 04 Jan 2007 17:50:11 +1100

Suparna Bhattacharya wrote:
On Thu, Jan 04, 2007 at 04:51:58PM +1100, Nick Piggin wrote:

So long as AIO threads do the same, there would be no problem (plugging
is optional, of course).

Yup, the AIO threads run the same code as for regular IO, i.e. in the rare
situations where they actually end up submitting IO, so there should
be no problem. And you have already added plug/unplug at the appropriate
places in those paths, so things should just work.

Yes I think it should.

This (is supposed to) give a number of improvements over the traditional
plugging (although some downsides too). Most notably for me, the VM gets
cleaner ;)

However AIO could be an interesting case to test for explicit plugging
because of the way they interact. What kind of improvements do you see
with samba and do you have any benchmark setups?

I think aio-stress would be a good way to test/benchmark this sort of
stuff, at least for a start. 
Samba (if I understand this correctly based on my discussions with Tridge)
is less likely to generate the kind of io patterns that could benefit from
explicit plugging (because the file server has no way to tell what the next
request is going to be, it ends up submitting each independently instead of
batching iocbs).

OK, but I think that after IO submission, you do not run sync_page to
unplug the block device, like the normal IO path would (via lock_page,
before the explicit plug patches).

However, with explicit plugging, AIO requests will be started immediately.
Maybe this won't be noticeable if the device is always busy, but I would
like to know there isn't a regression.

In future there may be optimization possibilities to consider when
submitting batches of iocbs, i.e. on the io submission path. Maybe
AIO - O_DIRECT would be interesting to play with first in this regard?

Well I've got some simple per-process batching in there now, each process
has a list of pending requests. Request merging is done locklessly against
the last request added; and submission at unplug time is batched under a
single block device lock.
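
To make the scheme concrete, here is a rough user-space sketch of it. This
is not the code from the explicit plugging patches: every name in it
(toy_plug, toy_device, toy_submit, toy_unplug, PLUG_MAX) is made up for
illustration, and the block device lock is modelled by a simple counter
rather than a real spinlock.

```c
#include <assert.h>
#include <stddef.h>

#define PLUG_MAX 16 /* hypothetical cap on the per-process pending list */

/* A request covers a contiguous run of sectors. */
struct toy_request {
	unsigned long sector;
	unsigned long nr_sectors;
};

/* Stand-in for the block device; the counter models taking its lock. */
struct toy_device {
	unsigned long lock_acquisitions;
	unsigned long dispatched;
};

/* Per-process list of pending requests (the "plug"). */
struct toy_plug {
	struct toy_request pending[PLUG_MAX];
	size_t count;
};

static void toy_unplug(struct toy_plug *plug, struct toy_device *dev);

/*
 * Queue a request on the private list. The list belongs to one process,
 * so the merge check against the last request needs no lock.
 */
static void toy_submit(struct toy_plug *plug, struct toy_device *dev,
		       unsigned long sector, unsigned long nr_sectors)
{
	assert(nr_sectors > 0);

	if (plug->count > 0) {
		struct toy_request *last = &plug->pending[plug->count - 1];

		/* Back merge: new request starts where the last one ends. */
		if (last->sector + last->nr_sectors == sector) {
			last->nr_sectors += nr_sectors;
			return;
		}
	}

	/* List full: flush early so the new request has a slot. */
	if (plug->count == PLUG_MAX)
		toy_unplug(plug, dev);

	plug->pending[plug->count].sector = sector;
	plug->pending[plug->count].nr_sectors = nr_sectors;
	plug->count++;
}

/* Hand the whole pending list to the device under one lock acquisition. */
static void toy_unplug(struct toy_plug *plug, struct toy_device *dev)
{
	if (plug->count == 0)
		return;

	dev->lock_acquisitions++;        /* lock taken once per batch */
	dev->dispatched += plug->count;  /* all pending requests issued */
	plug->count = 0;                 /* lock would be dropped here */
}
```

In this toy model, submitting sectors 0-7, then 8-15, then 100-107 leaves two
pending requests (the first two merge), and a single toy_unplug() call
dispatches both with one lock acquisition.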

I'm sure more merging or batching could be done, but also consider that
most programs will not ever make use of any added complexity.

Regarding your patches, I've just had a quick look and have a question --
what do you do about blocking in page reclaim and dirty balancing? Aren't
those major points of blocking with buffered IO? Did your test cases
dirty enough pages to start writeout or cause a lot of reclaim? (admittedly,
blocking in reclaim will now be much less common since the dirty mapping
accounting).

--
SUSE Labs, Novell Inc.