[PATCH][try4] fs: Add hooks for get_hole_size to generic_block_fiemap

Bob Peterson <rpeterso@xxxxxxxxxx> · Thu, 4 Sep 2014 20:04:25 -0400 (EDT)

Hi,

This version uses a new buffer flag, holesize, as Dave Chinner
suggested.

The problem:
If you do a fiemap operation on a very large sparse file, it can take
an extremely long time (we're talking days here) because
__generic_block_fiemap() searches block by block when it encounters
a hole.

The solution:
Allow the underlying file system to return the hole size so that function
__generic_block_fiemap can quickly skip the hole.

Preamble:
In cases where the fs-specific block_map() function (the get_block
callback passed to __generic_block_fiemap()) finds a hole, it can
return the hole size, in bytes, in b_size. This is efficient because
the file system doesn't need to work out the block mapping a second
time to determine the hole size. The patch uses a new buffer_holesize
flag to tell when the fs-specific block_map() is passing back the
hole size. If block_map() doesn't set the buffer_holesize bit,
__generic_block_fiemap() advances by one block, as before.

Other file systems that want to take advantage of the new "hole size"
functionality need only write their own function to determine the
hole size, call it from their respective block_map() function, and
call set_buffer_holesize() to put it into use. I've written a simple
patch to GFS2 that does just that, as a follow-on.
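
To make the file system side concrete, here is a minimal user-space
sketch of a block_map()-style callback that reports the hole size. It
is not kernel code and not the GFS2 follow-on patch: struct toy_bh,
struct toy_file, the TOY_* flags, toy_hole_blocks() and
toy_block_map() are all hypothetical stand-ins, and a 4 KiB block
size is assumed.

```c
#include <stdint.h>

/* Hypothetical user-space stand-ins for kernel types; not the kernel API. */
#define TOY_BLKBITS 12u			/* assumption: 4 KiB blocks */

enum {
	TOY_MAPPED   = 1u << 0,		/* stands in for BH_Mapped */
	TOY_HOLESIZE = 1u << 1,		/* stands in for BH_Holesize */
};

struct toy_bh {
	unsigned int state;	/* TOY_* flags */
	uint64_t b_size;	/* in: bytes requested, out: bytes described */
};

/* A toy file: a single data extent surrounded by holes. */
struct toy_file {
	uint64_t data_start;	/* first data block */
	uint64_t data_len;	/* number of data blocks */
	uint64_t size_blocks;	/* file size in blocks */
};

/* Hypothetical fs helper: number of hole blocks starting at blk. */
static uint64_t toy_hole_blocks(const struct toy_file *f, uint64_t blk)
{
	if (blk < f->data_start)
		return f->data_start - blk;	/* hole before the extent */
	return f->size_blocks - blk;		/* hole runs to end of file */
}

/* block_map()-style callback: map one block, or report the hole size. */
int toy_block_map(const struct toy_file *f, uint64_t blk, struct toy_bh *bh)
{
	if (blk >= f->size_blocks)
		return -1;			/* beyond end of file */

	if (blk >= f->data_start && blk < f->data_start + f->data_len) {
		bh->state |= TOY_MAPPED;
		bh->b_size = (uint64_t)1 << TOY_BLKBITS;
		return 0;
	}

	/* Hole: pass its size back in bytes and flag that we did so. */
	bh->b_size = toy_hole_blocks(f, blk) << TOY_BLKBITS;
	bh->state |= TOY_HOLESIZE;
	return 0;
}
```

The point of the sketch is only that the hole size falls out of the
same lookup that discovered the hole, so reporting it costs the file
system no second mapping pass.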

Patch description:

This patch changes __generic_block_fiemap() so that if the
fs-specific block_map() sets the buffer_holesize flag for an unmapped
buffer, it takes the returned b_size to be the size of the hole, in
bytes, and advances start_blk past the hole. This is much faster than
trying each block individually when large holes are encountered.
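
As a rough illustration of the difference, the user-space sketch
below counts how many mapping calls it takes to reach the first data
block, with and without a reported hole size. Everything in it is
hypothetical (struct demo_map, demo_get_block(), the 100000-block
hole, the 4 KiB block size); the right shift stands in for
logical_to_blk() and the loop only mirrors the shape of the hole
branch in the patch.

```c
#include <stdint.h>

#define DEMO_BLKBITS	12u		/* assumption: 4 KiB blocks */
#define DEMO_HOLE_BLKS	100000ull	/* toy layout: hole, then data */

/* Hypothetical stand-in for the parts of buffer_head used here. */
struct demo_map {
	int mapped;		/* like buffer_mapped() */
	int holesize;		/* like buffer_holesize() */
	uint64_t b_size;	/* hole size in bytes when holesize is set */
};

/* Toy mapping callback: blocks below DEMO_HOLE_BLKS are a hole. */
void demo_get_block(uint64_t blk, struct demo_map *m, int report_hole)
{
	m->mapped = 0;
	m->holesize = 0;
	m->b_size = 0;

	if (blk >= DEMO_HOLE_BLKS) {
		m->mapped = 1;
		return;
	}
	if (report_hole) {
		m->holesize = 1;
		m->b_size = (DEMO_HOLE_BLKS - blk) << DEMO_BLKBITS;
	}
}

/* Count mapping calls needed to reach the first mapped block. */
uint64_t demo_calls_to_first_data(int report_hole)
{
	uint64_t start_blk = 0;
	uint64_t calls = 0;
	struct demo_map m;

	for (;;) {
		calls++;
		demo_get_block(start_blk, &m, report_hole);
		if (m.mapped)
			break;
		if (m.holesize)
			start_blk += m.b_size >> DEMO_BLKBITS;	/* skip hole */
		else
			start_blk++;				/* old behaviour */
	}
	return calls;
}
```

In this toy layout the block-by-block walk needs one call per hole
block plus one for the data block, while the hole-size walk needs two
calls regardless of how large the hole is.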

Regards,

Bob Peterson
Red Hat File Systems

Signed-off-by: Bob Peterson <rpeterso@xxxxxxxxxx> 
---
 fs/ioctl.c                  | 7 ++++++-
 include/linux/buffer_head.h | 2 ++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 8ac3fad..121ba6f 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -291,13 +291,18 @@ int __generic_block_fiemap(struct inode *inode,
 		memset(&map_bh, 0, sizeof(struct buffer_head));
 		map_bh.b_size = len;
 
+		clear_buffer_holesize(&map_bh);
 		ret = get_block(inode, start_blk, &map_bh, 0);
 		if (ret)
 			break;
 
 		/* HOLE */
 		if (!buffer_mapped(&map_bh)) {
-			start_blk++;
+			if (buffer_holesize(&map_bh))
+				start_blk += logical_to_blk(inode,
+							    map_bh.b_size);
+			else
+				start_blk++;
 
 			/*
 			 * We want to handle the case where there is an
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 324329c..39ed1f1 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -37,6 +37,7 @@ enum bh_state_bits {
 	BH_Meta,	/* Buffer contains metadata */
 	BH_Prio,	/* Buffer should be submitted with REQ_PRIO */
 	BH_Defer_Completion, /* Defer AIO completion to workqueue */
+	BH_Holesize,    /* Hole encountered, hole size returned */
 
 	BH_PrivateStart,/* not a state bit, but the first bit available
 			 * for private allocation by other entities
@@ -128,6 +129,7 @@ BUFFER_FNS(Boundary, boundary)
 BUFFER_FNS(Write_EIO, write_io_error)
 BUFFER_FNS(Unwritten, unwritten)
 BUFFER_FNS(Meta, meta)
+BUFFER_FNS(Holesize, holesize)
 BUFFER_FNS(Prio, prio)
 BUFFER_FNS(Defer_Completion, defer_completion)
 