Patch "btrfs: fix partial loss of prealloc extent past i_size after fsync" has been added to the 5.4-stable tree

This is a note to let you know that I've just added the patch titled

    btrfs: fix partial loss of prealloc extent past i_size after fsync

to the 5.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     btrfs-fix-partial-loss-of-prealloc-extent-past-i_siz.patch
and it can be found in the queue-5.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit cb1cec4c34c10c70dc696014b2b0a721c4ae37d2
Author: Filipe Manana <fdmanana@xxxxxxxx>
Date:   Thu Apr 23 16:30:53 2020 +0100

    btrfs: fix partial loss of prealloc extent past i_size after fsync
    
    commit f135cea30de5f74d5bfb5116682073841fb4af8f upstream.
    
    When we have an inode with a prealloc extent that starts at an offset
    lower than the i_size and there is another prealloc extent that starts at
    an offset beyond i_size, we can end up losing part of the first prealloc
    extent (the part that starts at i_size) and have an implicit hole if we
    fsync the file and then have a power failure.
    
    Consider the following example with comments explaining how and why it
    happens.
    
      $ mkfs.btrfs -f /dev/sdb
      $ mount /dev/sdb /mnt
    
      # Create our test file with 2 consecutive prealloc extents, each with a
      # size of 128Kb, and covering the range from 0 to 256Kb, with a file
      # size of 0.
      $ xfs_io -f -c "falloc -k 0 128K" /mnt/foo
      $ xfs_io -c "falloc -k 128K 128K" /mnt/foo
    
      # Fsync the file to record both extents in the log tree.
      $ xfs_io -c "fsync" /mnt/foo
    
      # Now do a redundant extent allocation for the range from 0 to 64Kb.
      # This will merely increase the file size from 0 to 64Kb. Alternatively,
      # we could do a truncate to set the file size to 64Kb.
      $ xfs_io -c "falloc 0 64K" /mnt/foo
    
      # Fsync the file, so we update the inode item in the log tree with the
      # new file size (64Kb). This also ends up setting the number of bytes
      # for the first prealloc extent to 64Kb. This is done by the truncation
      # at btrfs_log_prealloc_extents().
      # This means that if a power failure happens after this, a write into
      # the file range 64Kb to 128Kb will not use the prealloc extent and
      # will result in allocation of a new extent.
      $ xfs_io -c "fsync" /mnt/foo
    
      # Now set the file size to 256K with a truncate and then fsync the file.
      # Since no changes happened to the extents, the fsync only updates the
      # i_size in the inode item at the log tree. This results in an implicit
      # hole for the file range from 64Kb to 128Kb, something which fsck will
      # complain about when not using the NO_HOLES feature, if we replay the
      # log after a power failure.
      $ xfs_io -c "truncate 256K" -c "fsync" /mnt/foo
    
    So instead of always truncating the log to the inode's current i_size at
    btrfs_log_prealloc_extents(), check first if there's a prealloc extent
    that starts at an offset lower than the i_size and with a length that
    crosses the i_size - if there is one, just make sure we truncate to a
    size that corresponds to the end offset of that prealloc extent, so
    that we don't lose the part of that extent that starts at i_size if a
    power failure happens.
    
    A test case for fstests follows soon.
    
    Fixes: 31d11b83b96f ("Btrfs: fix duplicate extents after fsync of file with prealloc extents")
    CC: stable@xxxxxxxxxxxxxxx # 4.14+
    Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
    Signed-off-by: David Sterba <dsterba@xxxxxxxx>
    Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 6f2178618c22d..7d464b049507a 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -4242,6 +4242,9 @@ static int btrfs_log_prealloc_extents(struct btrfs_trans_handle *trans,
 	const u64 ino = btrfs_ino(inode);
 	struct btrfs_path *dst_path = NULL;
 	bool dropped_extents = false;
+	u64 truncate_offset = i_size;
+	struct extent_buffer *leaf;
+	int slot;
 	int ins_nr = 0;
 	int start_slot;
 	int ret;
@@ -4256,9 +4259,43 @@ static int btrfs_log_prealloc_extents(struct btrfs_trans_handle *trans,
 	if (ret < 0)
 		goto out;
 
+	/*
+	 * We must check if there is a prealloc extent that starts before the
+	 * i_size and crosses the i_size boundary. This is to ensure later we
+	 * truncate down to the end of that extent and not to the i_size, as
+	 * otherwise we end up losing part of the prealloc extent after a log
+	 * replay and with an implicit hole if there is another prealloc extent
+	 * that starts at an offset beyond i_size.
+	 */
+	ret = btrfs_previous_item(root, path, ino, BTRFS_EXTENT_DATA_KEY);
+	if (ret < 0)
+		goto out;
+
+	if (ret == 0) {
+		struct btrfs_file_extent_item *ei;
+
+		leaf = path->nodes[0];
+		slot = path->slots[0];
+		ei = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
+
+		if (btrfs_file_extent_type(leaf, ei) ==
+		    BTRFS_FILE_EXTENT_PREALLOC) {
+			u64 extent_end;
+
+			btrfs_item_key_to_cpu(leaf, &key, slot);
+			extent_end = key.offset +
+				btrfs_file_extent_num_bytes(leaf, ei);
+
+			if (extent_end > i_size)
+				truncate_offset = extent_end;
+		}
+	} else {
+		ret = 0;
+	}
+
 	while (true) {
-		struct extent_buffer *leaf = path->nodes[0];
-		int slot = path->slots[0];
+		leaf = path->nodes[0];
+		slot = path->slots[0];
 
 		if (slot >= btrfs_header_nritems(leaf)) {
 			if (ins_nr > 0) {
@@ -4296,7 +4333,7 @@ static int btrfs_log_prealloc_extents(struct btrfs_trans_handle *trans,
 				ret = btrfs_truncate_inode_items(trans,
 							 root->log_root,
 							 &inode->vfs_inode,
-							 i_size,
+							 truncate_offset,
 							 BTRFS_EXTENT_DATA_KEY);
 			} while (ret == -EAGAIN);
 			if (ret)
