On Sat, Apr 11, 2015 at 08:30:40AM +1000, Dave Chinner wrote:
> On Fri, Apr 10, 2015 at 04:21:47PM -0400, Brian Foster wrote:
> > On Fri, Apr 10, 2015 at 11:37:57PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > >
> > > Now we have an ioend being passed unconditionally to the direct IO
> > > write completion context, we can pass a preallocated transaction
> > > handle for on-disk inode size updates that are run in completion.
> > >
> > > At this point we really need to be passing the correct block range
> > > that the IO spans through the ioend, so calculate the last block in
> > > the mapping before we map the allocated range and use that instead
> > > of the size desired by the direct IO.
> > >
> > > This enables us to keep track of multiple get-blocks calls in the
> > > same direct IO - the ioend will keep coming back to us, and we can
> > > keep extending its range as new allocations and mappings are done.
> > >
> > > There are some new trace points added for debugging, and a small
> > > hack to actually make the tracepoints work (enums in tracepoints
> > > that use __print_symbolic don't work correctly) that should be fixed
> > > in the 4.1 merge window. This hack can be removed when the
> > > tracepoint fix is upstream.
> > >
> > > There are lots of comments explaining the intricacies of passing the
> > > ioend and append transaction in the code; they are better placed in
> > > the code because we're going to need them to understand why this
> > > code does what it does in a few years time....
> > >
> > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > > ---
> >
> > I still need to look at this one (and grok the dio code more)... but an
> > initial question: is this multiple get_blocks() call aggregation a
> > requirement for the append ioend mechanism or an optimization? If the
> > latter, I think a separate patch is more appropriate...
>
> Requirement. Direct IO is a twisty maze of passages loaded with
> deadly traps. e.g. non AIO path:
>
> ->direct_IO
>   alloc dio(off, len)
>   loop until all IO issued {
>     get_blocks
>       dio->private = bh_result->b_private
>     build bio
>     dio->ref++
>     submit bio
>   }
>
>   dio_await_completion(dio)
>   dio_complete(dio)
>     dio->ref-- => goes to zero
>     dio->end_io(filp, off, len, dio->private)
>       xfs_end_io_direct_write(... off, len, ioend)
>
> So, essentially, for as many bios that are mapped and submitted for
> the direct IO, there is only one end IO completion call for the
> entire IO. The multiple mappings we make need to aggregate the state
> of the entire IO, not just the single instance....
>

Ok, thanks for the breakdown. Essentially, we need to track the
highest-precedence I/O type of the overall DIO with respect to the
completion handler.

The patch itself is not hard to follow, but the dio path is a
different beast. What I didn't quite catch when first playing with
this is the mapping size optimization earlier in the get_blocks call,
which effectively defeats some of this by reducing the need for
multiple calls in many cases. A single DIO write over a range of
alternating map types (e.g., preallocated blocks interleaved with
holes) is a better way to trigger the ioend aggregation.

Additional comments to follow...

Brian

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
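
P.S. Purely as an illustration of the aggregation Dave describes above
(several mappings feeding one completion that sees the merged state),
here is a rough, userspace-only sketch. All names (dio_ioend,
ioend_extend, DIO_APPEND, etc.) are made up for the example and are
not the real XFS/kernel identifiers:

/*
 * Hypothetical sketch: one "ioend" accumulates range and I/O type
 * across several get_blocks-style mapping calls, and a single
 * completion acts on the merged result.
 */
#include <stdio.h>

/* Ordered so that a larger value takes precedence at completion time. */
enum dio_io_type {
	DIO_OVERWRITE = 0,	/* plain overwrite, nothing extra to do */
	DIO_UNWRITTEN = 1,	/* unwritten extent conversion needed */
	DIO_APPEND    = 2,	/* on-disk inode size update needed */
};

struct dio_ioend {
	long long	 offset;	/* start of the range covered */
	long long	 size;		/* total bytes covered so far */
	enum dio_io_type type;		/* highest-precedence type seen */
};

/* Called once per mapping, mimicking repeated get_blocks() calls. */
static void ioend_extend(struct dio_ioend *ioend, long long off,
			 long long len, enum dio_io_type type)
{
	if (ioend->size == 0)
		ioend->offset = off;
	ioend->size += len;
	if (type > ioend->type)
		ioend->type = type;	/* aggregate, don't overwrite */
}

/* Called exactly once, like ->end_io for the whole direct I/O. */
static void ioend_complete(struct dio_ioend *ioend)
{
	printf("completion: off=%lld size=%lld type=%d\n",
	       ioend->offset, ioend->size, ioend->type);
}

int main(void)
{
	struct dio_ioend ioend = { 0, 0, DIO_OVERWRITE };

	/* e.g. a write spanning prealloc (unwritten) and a hole past EOF */
	ioend_extend(&ioend, 0,    4096, DIO_UNWRITTEN);
	ioend_extend(&ioend, 4096, 4096, DIO_APPEND);
	ioend_extend(&ioend, 8192, 4096, DIO_UNWRITTEN);

	/* reports DIO_APPEND for the whole 0..12288 range */
	ioend_complete(&ioend);
	return 0;
}

The monotonic enum ordering is just this sketch's way of expressing
"track the highest-precedence I/O type"; the actual precedence rules
and completion work live in the XFS completion code, not here.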