Re: [PATCH v2] xfs: do not unshare any blocks beyond eof

Julian Sun <sunjunchao2870@xxxxxxxxx> · Sun, 29 Sep 2024 16:30:12 +0800

Hi, Darrick. Thanks for your review and comments.

Darrick J. Wong <djwong@xxxxxxxxxx> 于2024年9月28日周六 12:44写道：
>
> [cc linux-xfs]

Sorry for missing the CC to the XFS mailing list...
>
> On Fri, Sep 27, 2024 at 02:53:44PM +0800, Julian Sun wrote:
> > Attempting to unshare extents beyond EOF will trigger
> > the need zeroing case, which in turn triggers a warning.
> > Therefore, let's skip the unshare process if blocks are
> > beyond EOF.
> >
> > This patch passed the xfstests using './check -g quick', without
> > causing any additional failure
> >
> > Reported-and-tested-by: syzbot+296b1c84b9cbf306e5a0@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Closes: https://syzkaller.appspot.com/bug?extid=296b1c84b9cbf306e5a0
> > Fixes: 32a38a499104 ("iomap: use write_begin to read pages to unshare")
> > Inspired-by: Dave Chinner <david@xxxxxxxxxxxxx>
> > Signed-off-by: Julian Sun <sunjunchao2870@xxxxxxxxx>
> > ---
> >  fs/xfs/xfs_iomap.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> > index 72c981e3dc92..81a0514b8652 100644
> > --- a/fs/xfs/xfs_iomap.c
> > +++ b/fs/xfs/xfs_iomap.c
> > @@ -976,6 +976,7 @@ xfs_buffered_write_iomap_begin(
>
> I'm unsure about why this correction is in
> xfs_buffered_write_iomap_begin.  If extent size hints are enabled, this
> function delegates to xfs_direct_write_iomap_begin, which means that
> this isn't a complete fix.

My understanding is that xfs_get_extsz_hint() return a nonzero value
only involving realtime dev/inodes, right?
If that is true, and reflink is not supproted with realtime devices,
then fallocate unshare will directly return 0 within
xfs_reflink_unshare() for rt dev/inodes. Furthermore,
xfs_get_extsz_hint() will always return zero for non-rt dev/inodes,
which means that fallocate unshare will never enter
xfs_direct_write_iomap_begin().

I reviewed the code for xfs_direct_write_iomap_begin(), and there is
no handling of IOMAP_UNSHARE, just as xfs_buffered_write_iomap_begin()
has done. Perhaps this is the reason?
>
> Shouldn't it suffice to clamp offset/len in xfs_reflink_unshare?

Possible makes sense. However, as Christoph mentioned here[1] where I
did this in xfs_reflink_unshare(), we should consider the last block
if file size is not block aligned. IMO, it's more elegant to do it in
iomap_begin() callback...

If there's any misunderstanding or if I missed something, please let
me know, thanks!

[1]: https://lore.kernel.org/linux-xfs/Zu2FWuonuO97Q6V8@xxxxxxxxxxxxx/
>
> --D
>
> >       int                     error = 0;
> >       unsigned int            lockmode = XFS_ILOCK_EXCL;
> >       u64                     seq;
> > +     xfs_fileoff_t eof_fsb;
> >
> >       if (xfs_is_shutdown(mp))
> >               return -EIO;
> > @@ -1016,6 +1017,13 @@ xfs_buffered_write_iomap_begin(
> >       if (eof)
> >               imap.br_startoff = end_fsb; /* fake hole until the end */
> >
> > +     /* Don't try to unshare any blocks beyond EOF. */
> > +     eof_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip));
> > +     if (flags & IOMAP_UNSHARE && end_fsb > eof_fsb) {
> > +             xfs_trim_extent(&imap, offset_fsb, eof_fsb - offset_fsb);
> > +             end_fsb = eof_fsb;
> > +     }
> > +
> >       /* We never need to allocate blocks for zeroing or unsharing a hole. */
> >       if ((flags & (IOMAP_UNSHARE | IOMAP_ZERO)) &&
> >           imap.br_startoff > offset_fsb) {
> > @@ -1030,7 +1038,6 @@ xfs_buffered_write_iomap_begin(
> >        */
> >       if ((flags & IOMAP_ZERO) && imap.br_startoff <= offset_fsb &&
> >           isnullstartblock(imap.br_startblock)) {
> > -             xfs_fileoff_t eof_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip));
> >
> >               if (offset_fsb >= eof_fsb)
> >                       goto convert_delay;
> > --
> > 2.39.2
> >

Thanks,
-- 
Julian Sun <sunjunchao2870@xxxxxxxxx>