Re: [RFCv2 0/4] ext4: bmap & fiemap conversion to use iomap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 1/30/20 11:04 PM, Ritesh Harjani wrote:


On 1/30/20 9:30 PM, Darrick J. Wong wrote:
On Tue, Jan 28, 2020 at 03:48:24PM +0530, Ritesh Harjani wrote:
Hello All,

Background
==========
There are RFCv2 patches to move ext4 bmap & fiemap calls to use iomap APIs. This reduces the users of ext4_get_block API and thus a step towards getting rid of buffer_heads from ext4. Also reduces a lot of code by making use of
existing iomap_ops (except for xattr implementation).

Testing (done on ext4 master branch)
========
'xfstests -g auto' passes with default mkfs/mount configuration
(v/s which also pass with vanilla kernel without this patch). Except
generic/473 which also failes on XFS. This seems to be the test case issue since it expects the data in slightly different way as compared to what iomap
returns.
Point 2.a below describes more about this.

Observations/Review required
============================
1. bmap related old v/s new method differences:-
    a. In case if addr > INT_MAX, it issues a warning and
       returns 0 as the block no. While earlier it used to return the
       truncated value with no warning.

Good...

    b. block no. is only returned in case of iomap->type is IOMAP_MAPPED,        but not when iomap->type is IOMAP_UNWRITTEN. While with previously
       we used to get block no. for both of above cases.

Assuming the only remaining usecase of bmap is to tell old bootloaders
where to find vmlinuz blocks on disk, I don't see much reason to map
unwritten blocks -- there's no data there, and if your bootloader writes
to the filesystem(!) then you can't read whatever was written there
anyway.

Yes, no objection there. Just wanted to get it reviewed.



Uh, can we put this ioctl on the deprecation list, please? :)

2. Fiemap related old v/s new method differences:-
    a. iomap_fiemap returns the disk extent information in exact
       correspondence with start of user requested logical offset till the
       length requested by user. While in previous implementation the
       returned information used to give the complete extent information if
       the range requested by user lies in between the extent mapping.

This is a topic of much disagreement.  The FIEMAP documentation says
that the call must return data for the requested range, but *may* return
a mapping that extends beyond the requested range.

XFS (and now iomap) only return data for the requested range, whereas
ext4 has (had?) the behavior you describe.  generic/473 was an attempt
to enforce the ext4 behavior across all filesystems, but I put it in my
dead list and never run it.

So it's a behavioral change, but the new behavior isn't forbidden.

Sure, thanks.


    b. iomap_fiemap adds the FIEMAP_EXTENT_LAST flag also at the last
       fiemap_extent mapping range requested by the user via fm_length (
       if that has a valid mapped extent on the disk).

That sounds like a bug.  _LAST is supposed to be set on the last extent
in the file, not the last record in the queried dataset.

Thought so too, sure will spend some time to try fixing it.

Looked into this. I think below should fix our above reported problem with current iomap code.
If no objection I will send send PATCHv3 with below fix as the first
patch in the series.

diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
index bccf305ea9ce..ee53991810d5 100644
--- a/fs/iomap/fiemap.c
+++ b/fs/iomap/fiemap.c
@@ -100,7 +100,12 @@ int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi,
        }

        if (ctx.prev.type != IOMAP_HOLE) {
-               ret = iomap_to_fiemap(fi, &ctx.prev, FIEMAP_EXTENT_LAST);
+               u32 flags = 0;
+               loff_t isize = i_size_read(inode);
+
+               if (ctx.prev.offset + ctx.prev.length >= isize)
+                       flags |= FIEMAP_EXTENT_LAST;
+               ret = iomap_to_fiemap(fi, &ctx.prev, flags);
                if (ret < 0)
                        return ret;
        }


-ritesh





But if the user
       requested for more fm_length which could not be mapped in the last
       fiemap_extent, then the flag is not set.

Yes... if there were more extents to map than there was space in the map
array, then _LAST should remain unset to encourage userspace to come
back for the rest of the mappings.

(Unless maybe I'm misunderstanding here...)

e.g. output for above differences 2.a & 2.b
===========================================
create a file with below cmds.
1. fallocate -o 0 -l 8K testfile.txt
2. xfs_io -c "pwrite 8K 8K" testfile.txt
3. check extent mapping:- xfs_io -c "fiemap -v" testfile.txt
4. run this binary on with and without these patches:- ./a.out (test_fiemap_diff.c) [4]

o/p of xfs_io -c "fiemap -v"
============================================
With this patch on patched kernel:-
testfile.txt:
  EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
    0: [0..15]:         122802736..122802751    16 0x800
    1: [16..31]:        122687536..122687551    16   0x1

without patch on vanilla kernel (no difference):-
testfile.txt:
  EXT: FILE-OFFSET      BLOCK-RANGE          TOTAL FLAGS
    0: [0..15]:         332211376..332211391    16 0x800
    1: [16..31]:        332722392..332722407    16   0x1


o/p of a.out without patch:-
================
riteshh-> ./a.out
logical: [       0..      15] phys: 332211376..332211391 flags: 0x800 tot: 16
(0) extent flag = 2048

o/p of a.out with patch (both point 2.a & 2.b could be seen)
=======================
riteshh-> ./a.out
logical: [       0..       7] phys: 122802736..122802743 flags: 0x801 tot: 8
(0) extent flag = 2049

FYI - In test_fiemap_diff.c test we had
a. fm_extent_count = 1
b. fm_start = 0
c. fm_length = 4K
Whereas when we change fm_extent_count = 32, then we don't see any difference.

e.g. output for above difference listed in point 1.b
====================================================

o/p without patch (block no returned for unwritten block as well)
=========Testing IOCTL FIBMAP=========
File size = 16384, blkcnt = 4, blocksize = 4096
   0   41526422
   1   41526423
   2   41590299
   3   41590300

o/p with patch (0 returned for unwritten block)
=========Testing IOCTL FIBMAP=========
File size = 16384, blkcnt = 4, blocksize = 4096
   0          0          0
   1          0          0
   2   15335942      29953
   3   15335943      29953


Summary:-
========
Due to some of the observational differences to user, listed above,
requesting to please help with a careful review in moving this to iomap.
Digging into some older threads, it looks like these differences should be fine, since the same tools have been working fine with XFS (which uses iomap based
implementation) [1]
Also as Ted suggested in [3]: Fiemap & bmap spec could be made based on the ext4 implementation. But since all the tools also work with xfs which uses iomap
based fiemap, so we should be good there.

<nod> Thanks for the worked example output. :)

Thanks for the review. :)

ritesh



--D


References of some previous discussions:
=======================================
[1]: https://www.spinics.net/lists/linux-fsdevel/msg128370.html
[2]: https://www.spinics.net/lists/linux-fsdevel/msg127675.html
[3]: https://www.spinics.net/lists/linux-fsdevel/msg128368.html
[4]: https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/tools/test_fiemap_diff.c
[RFCv1]: https://www.spinics.net/lists/linux-ext4/msg67077.html


Ritesh Harjani (4):
   ext4: Add IOMAP_F_MERGED for non-extent based mapping
   ext4: Optimize ext4_ext_precache for 0 depth
   ext4: Move ext4 bmap to use iomap infrastructure.
   ext4: Move ext4_fiemap to use iomap infrastructure

  fs/ext4/extents.c | 288 +++++++---------------------------------------
  fs/ext4/inline.c  |  41 -------
  fs/ext4/inode.c   |   6 +-
  3 files changed, 49 insertions(+), 286 deletions(-)

--
2.21.0





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux