[PATCH 8/8] xfs: bufferheads are not needed in ->writepage

From: Dave Chinner <dchinner@xxxxxxxxxx>

To get rid of bufferheads from the writepage path, we have to get
rid of the bufferhead chaining that is done in the ioends to keep
track of the blocks under IO. We also currently mark the page clean
indirectly through bufferhead IO completion callbacks.

To move away from bufferheads, we need to track bios rather than
bufferheads, and on ioend completion we need to mark pages clean
directly. This makes it "interesting" for filesystems with sub-page
block size, because the bufferheads are used to track sub-page dirty
state. That is, only when all the bufferheads are clean is the page
marked clean. For now, we will ignore the sub-page block
size problem and address the block size = page size configuration
first. Once the bio/page handling infrastructure is in place we can
add support for sub-page block sizes.

Right now an xfs_ioend tracks a sequential region via a bufferhead
chain that is, at IO submission, converted to bios and then
submitted. A single xfs_ioend may require multiple bios to be
submitted, and so the ioend keeps a reference count of the number of
bios it needs completions from before it can process the IO
completion of the bufferhead chain across that region.
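
For reference, the existing ioend looks roughly like this (a sketch
modelled on the current fs/xfs/xfs_aops.h; treat the field names and
layout as approximate, not the exact definition):

/*
 * Sketch of the existing ioend: a bufferhead chain covering the
 * region under IO, plus a count of in-flight bios (io_remaining)
 * that must drop to zero before the chain can be processed at IO
 * completion.
 */
typedef struct xfs_ioend {
	struct xfs_ioend	*io_list;	/* next ioend in chain */
	unsigned int		io_type;	/* delalloc/unwritten */
	int			io_error;	/* first error seen */
	atomic_t		io_remaining;	/* outstanding bio count */
	struct inode		*io_inode;	/* file being written */
	struct buffer_head	*io_buffer_head;/* buffer chain head */
	struct buffer_head	*io_buffer_tail;/* buffer chain tail */
	size_t			io_size;	/* size of the region */
	xfs_off_t		io_offset;	/* offset in the file */
	struct work_struct	io_work;	/* completion work item */
} xfs_ioend_t;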

As such, we have a dual layer IO submission/completion process.
Assuming block size = page size, what we have is this:

pages		+-+-+-+-+-+-+-+-+-+
bufferhead	+-+-+-+-+-+-+-+-+-+
xfs_ioend	+eeeeeee+eeeeeeeee+
bios		+bbb+bbb+bbbbbb+bb+

So IO submission looks like:

	- .writepage is given a page
	- XFS creates an ioend or pulls the existing one from the
	  writepage context,
	- XFS walks the bufferheads on the page and adds them to
	  the ioend.
	- XFS chains ioends together when some kind of IO
	  discontiguity occurs
	- When all the page walks are complete, XFS "submits" the
	  ioend
	- XFS walks the bufferheads, marking them as under async
	  writeback
	- XFS walks the bufferheads again, building bios from the
	  pages backing the bufferheads. When a bio cannot have
	  more pages added to it or there is a discontinuity in
	  the IO mapping, the bio is submitted and a new one is
	  started.
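
In code, the existing two-pass submission looks roughly like this
(helper names modelled on the current xfs_aops.c; wbc plumbing,
error handling and the block-contiguity check are elided):

static void xfs_submit_ioend_sketch(struct xfs_ioend *ioend)
{
	struct buffer_head *bh;
	struct bio *bio = NULL;

	/* pass 1: mark every buffer in the chain under async writeback */
	for (bh = ioend->io_buffer_head; bh; bh = bh->b_private)
		xfs_start_buffer_writeback(bh);

	/* pass 2: build bios from the pages backing the buffers */
	for (bh = ioend->io_buffer_head; bh; bh = bh->b_private) {
		if (bio && !bio_add_page(bio, bh->b_page, bh->b_size,
					 bh_offset(bh))) {
			/* bio full: submit it and start a new one */
			xfs_submit_ioend_bio(ioend, bio);
			bio = NULL;
		}
		if (!bio) {
			bio = xfs_alloc_ioend_bio(bh);
			bio_add_page(bio, bh->b_page, bh->b_size,
				     bh_offset(bh));
		}
	}
	if (bio)
		xfs_submit_ioend_bio(ioend, bio);
}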

On IO completion:

	- XFS grabs the ioend from the bio, drops the bio and
	  decrements the reference count on the ioend.
	- ioend reference count goes to zero, runs endio callbacks
	  (e.g. size update, unwritten extent conversion).
	- ioend is destroyed
	- destruction walks the bufferhead chain on the ioend,
	  calling the bufferhead IO completion handler
	- bufferhead IO completion calls end_page_writeback()
	  appropriately.
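
The existing completion side, again as a simplified sketch:

/* bio completion: drop the bio, put the ioend reference */
static void xfs_end_bio_old(struct bio *bio)
{
	struct xfs_ioend *ioend = bio->bi_private;

	if (bio->bi_error)
		ioend->io_error = bio->bi_error;
	bio_put(bio);
	if (atomic_dec_and_test(&ioend->io_remaining))
		xfs_finish_ioend(ioend);	/* runs endio callbacks */
}

/* ioend destruction: buffer end_io ends page writeback for us */
static void xfs_destroy_ioend_old(struct xfs_ioend *ioend)
{
	struct buffer_head *bh, *next;

	for (bh = ioend->io_buffer_head; bh; bh = next) {
		next = bh->b_private;
		bh->b_end_io(bh, !ioend->io_error);
	}
	mempool_free(ioend, xfs_ioend_pool);
}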

IOWs, the xfs_ioend is really a mapping layer between bufferheads
and bios, and the bufferheads largely hide the pages from us in
the IO submission path.

To get rid of bufferheads, we have to get rid of the dependency on
bufferhead chaining for building bios and marking pages clean on IO
completion. What we really want is this:

pages		+-+-+-+-+-+-+-+-+-+
xfs_ioend	+eeeeeee+eeeeeeeee+
bios		+bbb+bbb+bbbbbb+bb+

We also need to be able to hold on to the bios being completed
until they are all done before we start ioend processing. It looks
like we can use chaining via the bi_private field (i.e. a single
linked list) to attach all the bios to the ioend prior to
submission; we replace that link with a reference count and a
pointer to the ioend during submission, and then rebuild the chain
during IO completion. We then don't drop the bio references until
we destroy the ioend, after we've walked all the pages held by the
bios and ended writeback on them.
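
A minimal sketch of that chaining scheme, assuming the ioend grows
an io_bio_list field (all helper names here are illustrative):

/* at ioend build time: push each new bio onto a single linked list */
static void xfs_ioend_add_bio(struct xfs_ioend *ioend, struct bio *bio)
{
	bio->bi_private = ioend->io_bio_list;
	ioend->io_bio_list = bio;
}

/* at submission: swap the list link for an ioend back pointer */
static void xfs_ioend_submit_bios(struct xfs_ioend *ioend)
{
	struct bio *bio = ioend->io_bio_list;
	struct bio *next;

	ioend->io_bio_list = NULL;
	for (; bio; bio = next) {
		next = bio->bi_private;
		atomic_inc(&ioend->io_remaining);
		bio->bi_private = ioend;
		bio->bi_end_io = xfs_end_bio_new;	/* see below */
		submit_bio(WRITE, bio);
	}
}

Note that pushing on the list head reverses the order, so a real
implementation would keep a tail pointer and append, submitting the
bios in the order they were built.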

This will also handle sub-page block sizes that may require multiple
bios to clean a page as long as submission always creates page
granularity ioends.

Hence IO submission should look like:

	- .writepage is given a page
	- XFS creates an ioend or pulls the existing one from the
	  writepage context
	- XFS grabs the iomap from the wpc or gets a new one
	- XFS checks that the page is adjacent to the previous one
	  and that the mapping is still valid. If either check
	  fails, it grabs a new iomap, creates a new bio and
	  chains the bio to the ioend. It then adds the page to
	  the bio and marks the page as under IO.
	- When all the page walks are complete, XFS "submits" the
	  ioend
	- XFS walks the bio chain, removing each bio, taking a
	  reference to the ioend, setting bi_private to the ioend,
	  and then submitting the bios in order.
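
Roughly, the per-page step above might look like this, where wpc is
the writepage context (names and signatures illustrative, locking
and error handling omitted):

static void xfs_writepage_add_sketch(struct xfs_writepage_ctx *wpc,
				     struct inode *inode,
				     struct page *page, xfs_off_t offset)
{
	/* stale mapping or discontiguous page: new iomap, new bio */
	if (!xfs_imap_valid(inode, &wpc->imap, offset)) {
		xfs_map_blocks(inode, offset, &wpc->imap);
		wpc->bio = xfs_chain_new_bio(wpc->ioend, wpc->bio);
	}

	/* add the page; if the bio is full, chain a new one and retry */
	if (!bio_add_page(wpc->bio, page, PAGE_SIZE, 0)) {
		wpc->bio = xfs_chain_new_bio(wpc->ioend, wpc->bio);
		bio_add_page(wpc->bio, page, PAGE_SIZE, 0);
	}

	set_page_writeback(page);	/* mark the page as under IO */
}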

On IO completion:

	- XFS grabs the ioend for the bio, chains the bio back to
	  the ioend, stashes any error in the ioend and drops the
	  reference to the ioend.
	- ioend reference count goes to zero, runs endio callbacks
	  (e.g. size update, unwritten extent conversion).
	- ioend is destroyed
	- destruction walks the bio chain, calling end_page_writeback()
	  on the pages within, dropping bio references to free them.
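
And the new completion side, sketched (serialization of the list
update against concurrent bio completions is omitted here):

/* bio completion: stash the error, rebuild the chain, drop ioend ref */
static void xfs_end_bio_new(struct bio *bio)
{
	struct xfs_ioend *ioend = bio->bi_private;

	if (bio->bi_error)
		ioend->io_error = bio->bi_error;

	/* chain the bio back onto the ioend; hold the bio reference */
	bio->bi_private = ioend->io_bio_list;
	ioend->io_bio_list = bio;

	if (atomic_dec_and_test(&ioend->io_remaining))
		xfs_finish_ioend(ioend);	/* runs endio callbacks */
}

/* ioend destruction: end writeback on the pages, then free the bios */
static void xfs_destroy_ioend_new(struct xfs_ioend *ioend)
{
	struct bio *bio, *next;
	struct bio_vec *bvec;
	int i;

	for (bio = ioend->io_bio_list; bio; bio = next) {
		next = bio->bi_private;
		bio_for_each_segment_all(bvec, bio, i)
			end_page_writeback(bvec->bv_page);
		bio_put(bio);		/* drop our bio reference */
	}
	mempool_free(ioend, xfs_ioend_pool);
}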

Simples, yes?

In a few patches' time, writepage will no longer have any bufferheads
in it. However, until we get rid of bufferheads completely, we still
need to make sure their state reflects the page state. Hence as a
stop-gap measure, the ioend bio submission and destruction will need
to walk the buffers on the pages and change their state
appropriately. This will be a wart on the side that will get removed
when bufferheads are removed from the other buffered IO paths in
XFS.
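
The stop-gap might look something like this on the completion side
(purely illustrative; it goes away with the bufferheads):

/* keep bufferhead state in sync with the page until bhs go away */
static void xfs_finish_page_buffers(struct page *page, int error)
{
	struct buffer_head *head = page_buffers(page);
	struct buffer_head *bh = head;

	do {
		if (error)
			clear_buffer_uptodate(bh);
		clear_buffer_async_write(bh);
		unlock_buffer(bh);
		bh = bh->b_this_page;
	} while (bh != head);
}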

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
---
 fs/xfs/xfs_aops.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 08a0205..e52eb0e 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -36,6 +36,7 @@
 #include <linux/pagevec.h>
 #include <linux/writeback.h>
 
+
 /*
  * structure owned by writepages passed to individual writepage calls
  */
-- 
2.5.0
