Re: Patch "cifs: Fix flushing, invalidation and file size with copy_file_range()" has been added to the 6.1-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

When testing the v6.1.69 kernel I bisected an issue to the below commit
which was added in v6.1.68. When running the xfstests[1] on cifs I
observe a null pointer dereference in cifs_flush_folio() because folio
is null and dereferenced in size = folio_size(folio).

[1] https://github.com/kdave/xfstests

This can be reproduced using many of the xfstests but reproduces with
generic/001 like below:

$ ./check -cifs generic/001

FSTYP         -- cifs
PLATFORM      -- Linux/x86_64 ip-172-31-36-150 6.1.69 #2 SMP
PREEMPT_DYNAMIC Wed Jan  3 22:36:43 UTC 2024
MKFS_OPTIONS  -- //172.31.34.66/scratch
MOUNT_OPTIONS -- -ousername=ec2-
user,password=abc123,noperm,mfsymlinks,cifsacl,actimeo=0 -o
context=system_u:object_r:root_t:s0 //172.31.34.66/scratch /mnt/scratch

generic/001 10s ...  

[  244.135555] run fstests generic/001 at 2024-01-03 22:44:59
[  244.637499] BUG: kernel NULL pointer dereference, address:
0000000000000000
[  244.638204] #PF: supervisor read access in kernel mode
[  244.638698] #PF: error_code(0x0000) - not-present page
[  244.639194] PGD 0 P4D 0 
[  244.639466] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  244.698047] CPU: 11 PID: 4123 Comm: cp Not tainted 6.1.69 #2
[  244.698598] Hardware name: Amazon EC2 m6i.4xlarge/, BIOS 1.0
10/16/2017
[  244.699228] RIP: 0010:cifs_flush_folio+0x3f/0x100 [cifs]
[  244.699782] Code: d2 41 54 89 cc 31 c9 55 53 48 89 f3 48 c1 ee 0c 48
83 ec 08 48 8b 7f 30 e8 7d 2c a6 f8 48 3d 00 f0 ff ff 0f 87 a5 00 00 00
<48> 8b 10 48 89 c5 b8 00 10 00 00 f7 c2 00 00 01 00 74 07 0f b6 4d
[  244.799263] RSP: 0018:ffffc25441ffbd20 EFLAGS: 00010207
[  244.888911] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
0000000000000000
[  244.898446] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffff9ed945eac000
[  244.899136] RBP: 00000000000003a1 R08: 0000000000000001 R09:
0000000000000000
[  244.997779] R10: 0000000000003e7f R11: 0000000000000000 R12:
ffffc25441ffbd90
[  244.998564] R13: ffffc25441ffbd88 R14: ffff9ed94af23850 R15:
0000000000000001
[  244.999264] FS:  00007f111404d340(0000) GS:ffff9ee7fecc0000(0000)
knlGS:0000000000000000
[  245.097782] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  245.098345] CR2: 0000000000000000 CR3: 00000001211fc001 CR4:
00000000007706e0
[  245.099122] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  245.099854] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  245.198200] PKRU: 55555554
[  245.198478] Call Trace:
[  245.198735]  <TASK>
[  245.198967]  ? show_trace_log_lvl+0x1c4/0x2d2
[  245.199399]  ? show_trace_log_lvl+0x1c4/0x2d2
[  245.199830]  ? cifs_remap_file_range+0x15d/0x5e0 [cifs]
[  245.298029]  ? __die_body.cold+0x8/0xd
[  245.298402]  ? page_fault_oops+0xac/0x140
[  245.298943]  ? exc_page_fault+0x62/0x140
[  245.299336]  ? asm_exc_page_fault+0x22/0x30
[  245.299751]  ? cifs_flush_folio+0x3f/0x100 [cifs]
[  245.397858]  ? cifs_precopy_set_eof+0x73/0x110 [cifs]
[  245.398383]  cifs_remap_file_range+0x15d/0x5e0 [cifs]
[  245.398909]  do_clone_file_range+0xe7/0x230
[  245.399329]  vfs_clone_file_range+0x37/0x140
[  245.399861]  ioctl_file_clone+0x45/0xa0
[  245.497918]  do_vfs_ioctl+0x79/0x890
[  245.498328]  __x64_sys_ioctl+0x69/0xc0
[  245.498706]  do_syscall_64+0x38/0x90
[  245.499070]  entry_SYSCALL_64_after_hwframe+0x64/0xce
[  245.499568] RIP: 0033:0x7f1113e3ec6b
[  245.499929] Code: 73 01 c3 48 8b 0d 9500 f7 d8 64 89 01 48 83 c8 ff
c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05
<48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 65 a1 1b 00 f7 d8 64 89 01 48
[  245.599410] RSP: 002b:00007fff140ebb88 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[  245.697930] RAX: ffffffffffffffda RBX: 00007fff140ec3b0 RCX:
00007f1113e3ec6b
[  245.698620] RDX: 0000000000000003 RSI: 0000000040049409 RDI:
0000000000000004
[  245.699306] RBP: 0000000000000003 R08: 0000000000000001 R09:
0000000000000004
[  245.797942] R10: 0000000000001000 R11: 0000000000000246 R12:
00007fff140ed616
[  245.798789] R13: 00007fff140ebfa0 R14: 0000000000000000 R15:
0000000000000002
[  245.799485]  </TASK>
[  245.799720] Modules linked in: cmac nls_utf8 cifs cifs_arc4 cifs_md4
dns_lver mousedev sunrpc nls_ascii nls_cp437 vfat fat psmouse
ghash_clmulni_intel atkbd libps2 vivaldi_fmap aesni_intel ena i8042
crypto_simd serio cryptd button drm sch_fq_codel fuse i2c_core configfs
drm_panel_orientation_quirks loop backlight dmi_sysfs crc32_pclmul
intel dm_mirror dm_region_hash dm_log dm_mod dax efivarfs
[  245.998457] CR2: 0000000000000000
[  245.998800] ---[ end trace 0000000000000000 ]---
[  246.087916] RIP: 0010:cifs_flush_folio+0x3f/0x100 [cifs]
[  246.088794] Code: d2 41 54 49 89 cc 31 c9 55 53 48 89 f3 48 c1 ee 0c
48 83 ec 08 48 8b 7f 30 e8 7d 2c a6 f8 48 3d 00 f0 ff ff 0f 87 a5 00 00
00 <48> 8b 10 48 89 c5 b8 00 10 00 00 f7 c2 00 00 01 00 74 07 0f b6 4d
[  246.099436] RSP: 0018:ffffc25441ffbd20 EFLAGS: 00010207
[  246.189258] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
0000000000000000
[  246.198724] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffff9ed945eac000
[  246.199411] RBP: 00000000000003a1 R08: 0000000000000001 R09:
0000000000000000
[  246.298002] R10: 0000000000003e7f R11: 0000000000000000 R12:
ffffc25441ffbd90
[  246.298687] R13: ffffc25441ffbd88 R14: ffff9ed94af23850 R15:
0000000000000001
[  246.299372] FS:  00007f111404d340(0000) GS:ffff9ee7fecc0000(0000)
knlGS:0000000000000000
[  246.398006] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  246.398569] CR2: 0000000000000000 CR3: 00000001211fc001 CR4:
00000000007706e0
[  246.399254] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  246.399934] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  246.498414] PKRU: 55555554
[  246.498698] Kernel panic - not syncing: Fatal exception
[  246.500042] Kernel Offset: 0x38000000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  246.698094] Rebooting in 5 seconds..

Any ideas what is causing this and what the resolution is? This doesn't
occur on the upstream/master kernel.

Thanks,
Suraj

On Mon, 2023-12-11 at 13:57 +0100, gregkh@xxxxxxxxxxxxxxxxxxx wrote:
> 
> This is a note to let you know that I've just added the patch titled
> 
>     cifs: Fix flushing, invalidation and file size with
> copy_file_range()
> 
> to the 6.1-stable tree which can be found at:
>    
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
> 
> The filename of the patch is:
>      cifs-fix-flushing-invalidation-and-file-size-with-
> copy_file_range.patch
> and it can be found in the queue-6.1 subdirectory.
> 
> If you, or anyone else, feels it should not be added to the stable
> tree,
> please let <stable@xxxxxxxxxxxxxxx> know about it.
> 
> 
> > From 7b2404a886f8b91250c31855d287e632123e1746 Mon Sep 17 00:00:00
> > 2001
> From: David Howells <dhowells@xxxxxxxxxx>
> Date: Fri, 1 Dec 2023 00:22:00 +0000
> Subject: cifs: Fix flushing, invalidation and file size with
> copy_file_range()
> 
> From: David Howells <dhowells@xxxxxxxxxx>
> 
> commit 7b2404a886f8b91250c31855d287e632123e1746 upstream.
> 
> Fix a number of issues in the cifs filesystem implementation of the
> copy_file_range() syscall in cifs_file_copychunk_range().
> 
> Firstly, the invalidation of the destination range is handled
> incorrectly:
> We shouldn't just invalidate the whole file as dirty data in the file
> may
> get lost and we can't just call truncate_inode_pages_range() to
> invalidate
> the destination range as that will erase parts of a partial folio at
> each
> end whilst invalidating and discarding all the folios in the middle. 
> We
> need to force all the folios covering the range to be reloaded, but
> we
> mustn't lose dirty data in them that's not in the destination range.
> 
> Further, we shouldn't simply round out the range to PAGE_SIZE at each
> end
> as cifs should move to support multipage folios.
> 
> Secondly, there's an issue whereby a write may have extended the file
> locally, but not have been written back yet.  This can leaves the
> local
> idea of the EOF at a later point than the server's EOF.  If a copy
> request
> is issued, this will fail on the server with STATUS_INVALID_VIEW_SIZE
> (which gets translated to -EIO locally) if the copy source extends
> past the
> server's EOF.
> 
> Fix this by:
> 
>  (0) Flush the source region (already done).  The flush does nothing
> and
>      the EOF isn't moved if the source region has no dirty data.
> 
>  (1) Move the EOF to the end of the source region if it isn't already
> at
>      least at this point.  If we can't do this, for instance if the
> server
>      doesn't support it, just flush the entire source file.
> 
>  (2) Find the folio (if present) at each end of the range, flushing
> it and
>      increasing the region-to-be-invalidated to cover those in their
>      entirety.
> 
>  (3) Fully discard all the folios covering the range as we want them
> to be
>      reloaded.
> 
>  (4) Then perform the copy.
> 
> Thirdly, set i_size after doing the copychunk_range operation as this
> value
> may be used by various things internally.  stat() hides the issue
> because
> setting ->time to 0 causes cifs_getatr() to revalidate the
> attributes.
> 
> These were causing the generic/075 xfstest to fail.
> 
> Fixes: 620d8745b35d ("Introduce cifs_copy_file_range()")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> cc: Paulo Alcantara <pc@xxxxxxxxxxxxx>
> cc: Shyam Prasad N <nspmangalore@xxxxxxxxx>
> cc: Rohith Surabattula <rohiths.msft@xxxxxxxxx>
> cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> cc: Jeff Layton <jlayton@xxxxxxxxxx>
> cc: linux-cifs@xxxxxxxxxxxxxxx
> cc: linux-mm@xxxxxxxxx
> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx>
> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> ---
>  fs/smb/client/cifsfs.c |  102
> +++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 99 insertions(+), 3 deletions(-)
> 
> --- a/fs/smb/client/cifsfs.c
> +++ b/fs/smb/client/cifsfs.c
> @@ -1191,6 +1191,72 @@ const struct inode_operations cifs_symli
>         .listxattr = cifs_listxattr,
>  };
>  
> +/*
> + * Advance the EOF marker to after the source range.
> + */
> +static int cifs_precopy_set_eof(struct inode *src_inode, struct
> cifsInodeInfo *src_cifsi,
> +                               struct cifs_tcon *src_tcon,
> +                               unsigned int xid, loff_t src_end)
> +{
> +       struct cifsFileInfo *writeable_srcfile;
> +       int rc = -EINVAL;
> +
> +       writeable_srcfile = find_writable_file(src_cifsi,
> FIND_WR_FSUID_ONLY);
> +       if (writeable_srcfile) {
> +               if (src_tcon->ses->server->ops->set_file_size)
> +                       rc = src_tcon->ses->server->ops-
> >set_file_size(
> +                               xid, src_tcon, writeable_srcfile,
> +                               src_inode->i_size, true /* no need to
> set sparse */);
> +               else
> +                       rc = -ENOSYS;
> +               cifsFileInfo_put(writeable_srcfile);
> +               cifs_dbg(FYI, "SetFSize for copychunk rc = %d\n",
> rc);
> +       }
> +
> +       if (rc < 0)
> +               goto set_failed;
> +
> +       netfs_resize_file(&src_cifsi->netfs, src_end);
> +       fscache_resize_cookie(cifs_inode_cookie(src_inode), src_end);
> +       return 0;
> +
> +set_failed:
> +       return filemap_write_and_wait(src_inode->i_mapping);
> +}
> +
> +/*
> + * Flush out either the folio that overlaps the beginning of a range
> in which
> + * pos resides or the folio that overlaps the end of a range unless
> that folio
> + * is entirely within the range we're going to invalidate.  We
> extend the flush
> + * bounds to encompass the folio.
> + */
> +static int cifs_flush_folio(struct inode *inode, loff_t pos, loff_t
> *_fstart, loff_t *_fend,
> +                           bool first)
> +{
> +       struct folio *folio;
> +       unsigned long long fpos, fend;
> +       pgoff_t index = pos / PAGE_SIZE;
> +       size_t size;
> +       int rc = 0;
> +
> +       folio = filemap_get_folio(inode->i_mapping, index);
> +       if (IS_ERR(folio))
> +               return 0;
> +
> +       size = folio_size(folio);
> +       fpos = folio_pos(folio);
> +       fend = fpos + size - 1;
> +       *_fstart = min_t(unsigned long long, *_fstart, fpos);
> +       *_fend   = max_t(unsigned long long, *_fend, fend);
> +       if ((first && pos == fpos) || (!first && pos == fend))
> +               goto out;
> +
> +       rc = filemap_write_and_wait_range(inode->i_mapping, fpos,
> fend);
> +out:
> +       folio_put(folio);
> +       return rc;
> +}
> +
>  static loff_t cifs_remap_file_range(struct file *src_file, loff_t
> off,
>                 struct file *dst_file, loff_t destoff, loff_t len,
>                 unsigned int remap_flags)
> @@ -1260,10 +1326,12 @@ ssize_t cifs_file_copychunk_range(unsign
>  {
>         struct inode *src_inode = file_inode(src_file);
>         struct inode *target_inode = file_inode(dst_file);
> +       struct cifsInodeInfo *src_cifsi = CIFS_I(src_inode);
>         struct cifsFileInfo *smb_file_src;
>         struct cifsFileInfo *smb_file_target;
>         struct cifs_tcon *src_tcon;
>         struct cifs_tcon *target_tcon;
> +       unsigned long long destend, fstart, fend;
>         ssize_t rc;
>  
>         cifs_dbg(FYI, "copychunk range\n");
> @@ -1303,13 +1371,41 @@ ssize_t cifs_file_copychunk_range(unsign
>         if (rc)
>                 goto unlock;
>  
> -       /* should we flush first and last page first */
> -       truncate_inode_pages(&target_inode->i_data, 0);
> +       /* The server-side copy will fail if the source crosses the
> EOF marker.
> +        * Advance the EOF marker after the flush above to the end of
> the range
> +        * if it's short of that.
> +        */
> +       if (src_cifsi->server_eof < off + len) {
> +               rc = cifs_precopy_set_eof(src_inode, src_cifsi,
> src_tcon, xid, off + len);
> +               if (rc < 0)
> +                       goto unlock;
> +       }
> +
> +       destend = destoff + len - 1;
> +
> +       /* Flush the folios at either end of the destination range to
> prevent
> +        * accidental loss of dirty data outside of the range.
> +        */
> +       fstart = destoff;
> +       fend = destend;
> +
> +       rc = cifs_flush_folio(target_inode, destoff, &fstart, &fend,
> true);
> +       if (rc)
> +               goto unlock;
> +       rc = cifs_flush_folio(target_inode, destend, &fstart, &fend,
> false);
> +       if (rc)
> +               goto unlock;
> +
> +       /* Discard all the folios that overlap the destination
> region. */
> +       truncate_inode_pages_range(&target_inode->i_data, fstart,
> fend);
>  
>         rc = file_modified(dst_file);
> -       if (!rc)
> +       if (!rc) {
>                 rc = target_tcon->ses->server->ops-
> >copychunk_range(xid,
>                         smb_file_src, smb_file_target, off, len,
> destoff);
> +               if (rc > 0 && destoff + rc >
> i_size_read(target_inode))
> +                       truncate_setsize(target_inode, destoff + rc);
> +       }
>  
>         file_accessed(src_file);
>  
> 
> 
> Patches currently in stable-queue which might be from
> dhowells@xxxxxxxxxx are
> 
> queue-6.1/cifs-fix-flushing-invalidation-and-file-size-with-
> copy_file_range.patch
> queue-6.1/cifs-fix-flushing-invalidation-and-file-size-with-
> ficlone.patch
> queue-6.1/cifs-fix-non-availability-of-dedup-breaking-generic-
> 304.patch
> 





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux