Re: [PATCH RFC 1/6] NFSD: Fix NFSv4 SETATTR's handling of large file sizes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Jan 27, 2022, at 7:36 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> 
> On Thu, Jan 27, 2022 at 11:08:31AM -0500, Chuck Lever wrote:
>> iattr::ia_size is a loff_t. decode_fattr4() dumps a full u64 value
>> in there. If that value is larger than S64_MAX, then ia_size has
>> underflowed.
>> 
>> In this case the negative size is passed through to the VFS and
>> underlying filesystems. I've observed XFS behavior: it returns
>> EIO but still attempts to access past the end of the device.
> 
> What attempts to access beyond the end of the device? A file offset
> is not a disk offset, and the filesystem cannot allocate blocks for
> IO that are outside the device boundaries. So I don't understand how
> setting an inode size of >LLONGMAX can cause the filesystem to
> access blocks outside the range it can allocate and map IO to. If
> this falls through to trying to access data outside the range the
> filesystem is allowed to access then we've got a bug that needs to
> be fixed.
> 
> Can you please clarify the behaviour that is occurring here (stack
> traces demonstrating the IO path that leads to access past the end
> of device would be useful) so we can look into this further?

Reproducer:

I constructed a pynfs test that sends an NFSv4.0 SETATTR request
to set the file length to U64_MAX. That test was applied to pynfs
today.

git://git.linux-nfs.org/projects/bfields/pynfs.git

I will note that I tried this test against a tmpfs export as
well. No crash, but a subsequent GETATTR returned U64_MAX,
which is surprising. There's really no checking in that path
either.


Note below: md0 is the device where this filesystem resides.
It's a pair of 3TB PCIe NVMe cards striped together. Kernel at
the time on this system was 5.17-rc1 + a few NFSD patches.

Jan 26 16:01:26 klimt.1015granger.net rpc.mountd[972]: v4.0 client attached: 0x61bb6d4261eef9f6 from "192.168.1.67:53636"
Jan 26 16:01:26 klimt.1015granger.net kernel: ------------[ cut here ]------------
Jan 26 16:01:26 klimt.1015granger.net kernel: WARNING: CPU: 2 PID: 1009 at fs/iomap/iter.c:33 iomap_iter+0x1b5/0x272
Jan 26 16:01:26 klimt.1015granger.net kernel: Modules linked in: rfkill rpcrdma rdma_ucm ib_srpt ib_umad ib_isert ib_ipoib iscsi_target_mod ib_>
Jan 26 16:01:26 klimt.1015granger.net kernel: CPU: 2 PID: 1009 Comm: nfsd Not tainted 5.17.0-rc1-00165-g2785fad9b745 #3338
Jan 26 16:01:26 klimt.1015granger.net kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 3.3 10/28/2020
Jan 26 16:01:26 klimt.1015granger.net kernel: RIP: 0010:iomap_iter+0x1b5/0x272
Jan 26 16:01:26 klimt.1015granger.net kernel: Code: 8b 73 08 49 8b 04 24 4d 89 e9 4d 89 f0 48 8b 3b e8 c0 79 8e 00 85 c0 0f 88 c1 00 00 00 48 8>
Jan 26 16:01:26 klimt.1015granger.net kernel: RSP: 0018:ffffa65701ea3a80 EFLAGS: 00010213
Jan 26 16:01:26 klimt.1015granger.net kernel: RAX: 0000000000000000 RBX: ffffa65701ea3ad0 RCX: ffff8ada8cf0f840
Jan 26 16:01:26 klimt.1015granger.net kernel: RDX: ffffffffffffffff RSI: ffff8ada49411000 RDI: ffff8ada8cf0f840
Jan 26 16:01:26 klimt.1015granger.net kernel: RBP: ffffa65701ea3aa0 R08: ffff8ada4a56b000 R09: ffffa65701ea3b40
Jan 26 16:01:26 klimt.1015granger.net kernel: R10: ffffa65701ea39a0 R11: 000000000efc25f5 R12: ffffffffc06d6100
Jan 26 16:01:26 klimt.1015granger.net kernel: R13: ffffa65701ea3b40 R14: ffffa65701ea3af8 R15: ffff8ada4a56b000
Jan 26 16:01:26 klimt.1015granger.net kernel: FS:  0000000000000000(0000) GS:ffff8ae97fd00000(0000) knlGS:0000000000000000
Jan 26 16:01:26 klimt.1015granger.net kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 26 16:01:26 klimt.1015granger.net kernel: CR2: 00007ffff8126260 CR3: 00000001a16dc001 CR4: 00000000001706e0
Jan 26 16:01:26 klimt.1015granger.net kernel: Call Trace:
Jan 26 16:01:26 klimt.1015granger.net kernel:  <TASK>
Jan 26 16:01:26 klimt.1015granger.net kernel:  iomap_zero_range+0x6c/0x1a9
Jan 26 16:01:26 klimt.1015granger.net kernel:  ? __radix_tree_lookup+0x2f/0xac
Jan 26 16:01:26 klimt.1015granger.net kernel:  iomap_truncate_page+0x31/0x36
Jan 26 16:01:26 klimt.1015granger.net kernel:  xfs_truncate_page+0x39/0x3b [xfs]
Jan 26 16:01:26 klimt.1015granger.net kernel:  xfs_setattr_size+0x11a/0x306 [xfs]
Jan 26 16:01:26 klimt.1015granger.net kernel:  xfs_vn_setattr_size+0x4e/0x57 [xfs]
Jan 26 16:01:26 klimt.1015granger.net kernel:  xfs_vn_setattr+0x67/0xb1 [xfs]
Jan 26 16:01:26 klimt.1015granger.net kernel:  notify_change+0x2ac/0x3a2
Jan 26 16:01:26 klimt.1015granger.net kernel:  nfsd_setattr+0x200/0x268 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  ? nfsd_setattr+0x200/0x268 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  nfsd4_setattr+0xf1/0x130 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  nfsd4_proc_compound+0x337/0x474 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  nfsd_dispatch+0x1a9/0x260 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  svc_process_common+0x331/0x4bc [sunrpc]
Jan 26 16:01:26 klimt.1015granger.net kernel:  ? nfsd_svc+0x2f5/0x2f5 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  svc_process+0xc8/0xe7 [sunrpc]
Jan 26 16:01:26 klimt.1015granger.net kernel:  nfsd+0xdd/0x160 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  kthread+0xf7/0xff
Jan 26 16:01:26 klimt.1015granger.net kernel:  ? nfsd_shutdown_threads+0x65/0x65 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  ? kthread_complete_and_exit+0x20/0x20
Jan 26 16:01:26 klimt.1015granger.net kernel:  ret_from_fork+0x22/0x30
Jan 26 16:01:26 klimt.1015granger.net kernel:  </TASK>
Jan 26 16:01:26 klimt.1015granger.net kernel: ---[ end trace 0000000000000000 ]---
Jan 26 16:01:26 klimt.1015granger.net kernel: ------------[ cut here ]------------
Jan 26 16:01:26 klimt.1015granger.net kernel: WARNING: CPU: 2 PID: 1009 at fs/iomap/iter.c:35 iomap_iter+0x1ca/0x272
Jan 26 16:01:26 klimt.1015granger.net kernel: Modules linked in: rfkill rpcrdma rdma_ucm ib_srpt ib_umad ib_isert ib_ipoib iscsi_target_mod ib_>
Jan 26 16:01:26 klimt.1015granger.net kernel: CPU: 2 PID: 1009 Comm: nfsd Tainted: G        W         5.17.0-rc1-00165-g2785fad9b745 #3338
Jan 26 16:01:26 klimt.1015granger.net kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 3.3 10/28/2020
Jan 26 16:01:26 klimt.1015granger.net kernel: RIP: 0010:iomap_iter+0x1ca/0x272
Jan 26 16:01:26 klimt.1015granger.net kernel: Code: 85 c0 0f 88 c1 00 00 00 48 8b 43 30 48 8b 53 08 48 39 d0 7e 02 0f 0b 48 8b 4b 38 48 85 c9 7>
Jan 26 16:01:26 klimt.1015granger.net kernel: RSP: 0018:ffffa65701ea3a80 EFLAGS: 00010293
Jan 26 16:01:26 klimt.1015granger.net kernel: RAX: f8ada41a253c0000 RBX: ffffa65701ea3ad0 RCX: f8ada41a253c0000
Jan 26 16:01:26 klimt.1015granger.net kernel: RDX: ffffffffffffffff RSI: ffff8ada49411000 RDI: ffff8ada8cf0f840
Jan 26 16:01:26 klimt.1015granger.net kernel: RBP: ffffa65701ea3aa0 R08: ffff8ada4a56b000 R09: ffffa65701ea3b40
Jan 26 16:01:26 klimt.1015granger.net kernel: R10: ffffa65701ea39a0 R11: 000000000efc25f5 R12: ffffffffc06d6100
Jan 26 16:01:26 klimt.1015granger.net kernel: R13: ffffa65701ea3b40 R14: ffffa65701ea3af8 R15: ffff8ada4a56b000
Jan 26 16:01:26 klimt.1015granger.net kernel: FS:  0000000000000000(0000) GS:ffff8ae97fd00000(0000) knlGS:0000000000000000
Jan 26 16:01:26 klimt.1015granger.net kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 26 16:01:26 klimt.1015granger.net kernel: CR2: 00007ffff8126260 CR3: 00000001a16dc001 CR4: 00000000001706e0
Jan 26 16:01:26 klimt.1015granger.net kernel: Call Trace:
Jan 26 16:01:26 klimt.1015granger.net kernel:  <TASK>
Jan 26 16:01:26 klimt.1015granger.net kernel:  iomap_zero_range+0x6c/0x1a9
Jan 26 16:01:26 klimt.1015granger.net kernel:  ? __radix_tree_lookup+0x2f/0xac
Jan 26 16:01:26 klimt.1015granger.net kernel:  iomap_truncate_page+0x31/0x36
Jan 26 16:01:26 klimt.1015granger.net kernel:  xfs_truncate_page+0x39/0x3b [xfs]
Jan 26 16:01:26 klimt.1015granger.net kernel:  xfs_setattr_size+0x11a/0x306 [xfs]
Jan 26 16:01:26 klimt.1015granger.net kernel:  xfs_vn_setattr_size+0x4e/0x57 [xfs]
Jan 26 16:01:26 klimt.1015granger.net kernel:  xfs_vn_setattr+0x67/0xb1 [xfs]
Jan 26 16:01:26 klimt.1015granger.net kernel:  notify_change+0x2ac/0x3a2
Jan 26 16:01:26 klimt.1015granger.net kernel:  nfsd_setattr+0x200/0x268 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  ? nfsd_setattr+0x200/0x268 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  nfsd4_setattr+0xf1/0x130 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  nfsd4_proc_compound+0x337/0x474 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  nfsd_dispatch+0x1a9/0x260 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  svc_process_common+0x331/0x4bc [sunrpc]
Jan 26 16:01:26 klimt.1015granger.net kernel:  ? nfsd_svc+0x2f5/0x2f5 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  svc_process+0xc8/0xe7 [sunrpc]
Jan 26 16:01:26 klimt.1015granger.net kernel:  nfsd+0xdd/0x160 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  kthread+0xf7/0xff
Jan 26 16:01:26 klimt.1015granger.net kernel:  ? nfsd_shutdown_threads+0x65/0x65 [nfsd]
Jan 26 16:01:26 klimt.1015granger.net kernel:  ? kthread_complete_and_exit+0x20/0x20
Jan 26 16:01:26 klimt.1015granger.net kernel:  ret_from_fork+0x22/0x30
Jan 26 16:01:26 klimt.1015granger.net kernel:  </TASK>
Jan 26 16:01:26 klimt.1015granger.net kernel: ---[ end trace 0000000000000000 ]---
Jan 26 16:01:26 klimt.1015granger.net kernel: nfsd: attempt to access beyond end of device
                                              md0: rw=2048, want=19907765165852672, limit=12501942272


>> IOW it assumes the caller has already sanity-checked the value.
> 
> Every filesystem assumes that the iattr that is passed to ->setattr
> by notify_change() has been sanity checked and the parameters are
> within the valid VFS supported ranges, not just XFS. Perhaps this
> check should be in notify_change, not in the callers?

My (limited) understanding of the VFS code is that functions at
the notify_change() level expect that its callers will have
already sanitized the input -- those callers are largely the
system call routines. That's why I chose to address this in NFSD.

Maybe inode_newsize_ok() needs to check for negative size values?


--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux