On Tue, Jun 21, 2011 at 1:41 PM, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> Dan Ehrenberg <dehrenberg@xxxxxxxxxx> writes:
>
>> This code introduces a fast-path variant of __blockdev_direct_IO
>> for the special case where the request size is a multiple of the page
>> size, the inode block size is a page, the user memory is page-aligned,
>> the underlying storage is contiguous on disk and the file location is
>> already initialized. The special case decreases the amount of
>> bookkeeping required, which saves a significant amount of CPU time on
>> a fast device such as a ramdisk or an SSD. The patch is inspired by
>> earlier code by Ken Chen.
>
> Is it understood why your fast path is that much faster?
> i.e. what's the slow part in the normal path that it avoids?
>
> I am wondering if some of the improvements could be gotten even for less
> rigid pre conditions.

I should start by saying that I really should've submitted this with
an [RFC] tag. I'm eager for feedback on my first Linux kernel patch,
and I'm really glad you responded.

The slowness in the dio code that I have observed is not in any
particular place, but rather a death of a thousand cuts. Lines like
  memset(dio, 0, offsetof(struct dio, pages));
show up as significant in the CPU profile, but so do other random
lines that manipulate the struct dio.

In an earlier version of the patch, I restricted the change to only
page-sized operations. This was criticized for being insufficiently
general. In generalizing to page-multiple operations, I noticed a
minor regression, which seems to be from the IS_ALIGNED calls.

You're right that these preconditions are rather rigid, though. If you
have a suggestion for a more general precondition, I can try it out
and see if it maintains the performance properties I want.

>
>> + /*
>> + * The i_alloc_sem will be released at I/O completion,
>> + * possibly in a different thread.
>> + */
>> + down_read_non_owner(&inode->i_alloc_sem);
>
> There's just a patch kit posted from hch which removes that semaphore.
>
> -Andi

Once that patch kit is finalized and merged, I can make a new version
of my patch based on the new synchronization mechanism.

Dan
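
For readers following the thread, the locally checkable preconditions
from the patch description (request size a multiple of the page size,
inode block size equal to a page, page-aligned user memory) might be
tested roughly as below. This is only a userspace sketch, not the code
from the patch: struct sketch_request, sketch_fast_path_ok() and
SKETCH_IS_ALIGNED are hypothetical names invented for illustration, and
the macro merely mirrors the idea behind the kernel's IS_ALIGNED. The
remaining preconditions (contiguous storage, already-initialized file
location) need filesystem knowledge and are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's IS_ALIGNED(); only valid when
 * the alignment a is a power of two.
 */
#define SKETCH_IS_ALIGNED(x, a) \
	((((uintptr_t)(x)) & ((uintptr_t)(a) - 1)) == 0)

/* Hypothetical description of a single direct-I/O request. */
struct sketch_request {
	const void *user_buf;      /* address of the user memory */
	size_t len;                /* request size in bytes */
	size_t inode_block_size;   /* block size of the inode in bytes */
};

/*
 * Return true if the request meets the preconditions that can be
 * checked without consulting the filesystem. Contiguity on disk and
 * "already initialized" are deliberately left out of this sketch.
 */
static bool sketch_fast_path_ok(const struct sketch_request *req,
				size_t page_size)
{
	/* The mask trick above requires a power-of-two page size. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	return req->inode_block_size == page_size &&
	       SKETCH_IS_ALIGNED(req->len, page_size) &&
	       SKETCH_IS_ALIGNED(req->user_buf, page_size);
}
```

Each check is a compare or a mask-and-compare, so any one of them is
cheap in isolation; the point made above is that on a fast device even
small per-request costs like these can become visible in a profile.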