Hello NFSD maintainers,
I found that running cat on the /proc/fs/nfsd/pool_stats file leads to a
kernel crash on the upstream kernel.
Console logs:
[221875.249341] Kernel attempted to read user page (0) - exploit
attempt? (uid: 0)
[221875.249347] BUG: Kernel NULL pointer dereference on read at 0x00000000
[221875.249351] Faulting instruction address: 0xc000000001071cd4
[221875.249356] Oops: Kernel access of bad area, sig: 11 [#1]
[221875.249360] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[221875.249365] Modules linked in: binfmt_misc bonding tls rfkill
pseries_rng vmx_crypto drm fuse drm_panel_orientation_quirks xfs
libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp
pseries_wdt dm_mirror dm_region_hash dm_log dm_mod
[221875.249408] CPU: 9 PID: 98011 Comm: cat Kdump: loaded Not tainted
6.10.0-rc3nfsd-module-dirty #6
[221875.249419] Hardware name: SNIP...
[221875.249433] NIP: c000000001071cd4 LR: c000000001071cc8 CTR:
c000000000fe5394
[221875.249443] REGS: c0000000553ff890 TRAP: 0300 Not tainted
(6.10.0-rc3nfsd-module-dirty)
[221875.249453] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR:
22222402 XER: 20040000
[221875.249463] CFAR: c00000000106e598 DAR: 0000000000000000 DSISR:
40000000 IRQMASK: 0
[221875.249463] GPR00: c000000001071cc8 c0000000553ffb30
c000000001a96300 0000000000000000
[221875.249463] GPR04: c00000000d006130 0000000000000001
0000000000000001 00000003fa6f0000
[221875.249463] GPR08: 00000003fa6f0000 0000000000000000
c000000094c40000 0000000022222402
[221875.249463] GPR12: c000000000fe5394 c0000003ffff4c80
0000000000000000 0000000000000000
[221875.249463] GPR16: 0000000000000000 0000000000000000
0000000000000000 0000000000000000
[221875.249463] GPR20: 0000000000000000 0000000000000000
c00000000d006140 0000000000000000
[221875.249463] GPR24: 0000000000000000 0000000000000000
c00000000d006130 0000000000000000
[221875.249463] GPR28: c0000000553ffca0 c0000000553ffcc8
c00000005521e988 0000000000000000
[221875.249527] NIP [c000000001071cd4] mutex_lock+0x34/0x88
[221875.249538] LR [c000000001071cc8] mutex_lock+0x28/0x88
[221875.249546] Call Trace:
[221875.249551] [c0000000553ffb30] [c000000001071cc8] mutex_lock+0x28/0x88 (unreliable)
[221875.249562] [c0000000553ffb60] [c000000000fe53c8] svc_pool_stats_start+0x34/0xa8
[221875.249575] [c0000000553ffb90] [c0000000006329f0] seq_read_iter+0x148/0x69c
[221875.249587] [c0000000553ffc70] [c000000000633048] seq_read+0x104/0x15c
[221875.249596] [c0000000553ffd10] [c0000000005e935c] vfs_read+0xdc/0x3a0
[221875.249608] [c0000000553ffdc0] [c0000000005ea41c] ksys_read+0x84/0x144
[221875.249615] [c0000000553ffe10] [c000000000030ae4] system_call_exception+0x124/0x330
[221875.249621] [c0000000553ffe50] [c00000000000cedc] system_call_vectored_common+0x15c/0x2ec
[221875.249632] --- interrupt: 3000 at 0x7fff85133cf4
[221875.249638] NIP: 00007fff85133cf4 LR: 00007fff85133cf4 CTR:
0000000000000000
[221875.249654] REGS: c0000000553ffe80 TRAP: 3000 Not tainted
(6.10.0-rc3nfsd-module-dirty)
[221875.249660] MSR: 800000000280f033
<SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 42222408 XER: 00000000
[221875.249674] IRQMASK: 0
[221875.249674] GPR00: 0000000000000003 00007ffff9a085f0
0000000114f87f00 0000000000000003
[221875.249674] GPR04: 00007fff852b0000 0000000000020000
0000000000000022 0000000000000000
[221875.249674] GPR08: 0000000000000000 0000000000000000
0000000000000000 0000000000000000
[221875.249674] GPR12: 0000000000000000 00007fff8548a5c0
0000000000000000 0000000000000000
[221875.249674] GPR16: 0000000020000000 0000000000000000
0000000000020000 0000000000000000
[221875.249674] GPR20: 00007ffff9a08ca8 0000000000000002
0000000000000000 0000000114f61d80
[221875.249674] GPR24: 0000000114f80110 0000000000020000
0000000114f67f60 000000007ff00000
[221875.249674] GPR28: 0000000000000003 00007fff852b0000
0000000000020000 00007fff852b0000
[221875.249756] NIP [00007fff85133cf4] 0x7fff85133cf4
[221875.249762] LR [00007fff85133cf4] 0x7fff85133cf4
[221875.249770] --- interrupt: 3000
[221875.249777] Code: 38424660 7c0802a6 60000000 7c0802a6 fbe1fff8
7c7f1b78 f8010010 f821ffd1 4bffc88d 60000000 39200000 e94d0908
<7d00f8a8> 7c284800 40c20010 7d40f9ad
[221875.249793] ---[ end trace 0000000000000000 ]---
[221875.252258] pstore: backend (nvram) writing error (-1)
[221875.252264]
The above logs are from the PowerPC architecture, but I think the issue
is reproducible on x86 as well.
The git bisect points to the following commit as the first bad commit:
commit 7b207ccd983350a5dedd132b57c666186dd02a7c
Author: NeilBrown <neilb@xxxxxxx>
Date: Fri Dec 15 11:56:32 2023 +1100
svc: don't hold reference for poolstats, only mutex.
A future patch will remove refcounting on svc_serv as it is of little
use.
It is currently used to keep the svc around while the pool_stats file
is open.
I investigated the issue and found that si->mutex is NULL in
svc_pool_stats_start(). Tracing further back, info->mutex is already
NULL in svc_pool_stats_open(), at the point where the private data is
assigned to the seq variable (of type struct seq_file).
nfsd_create_serv() does initialize the mutex, but it does not appear to
be called in this case.
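To illustrate what I think is happening, here is a minimal userspace
sketch of the pattern. It is not kernel code and not a proposed fix:
the struct layout, the model_* names, and the -ENODEV return value are
all assumptions made for illustration. It only shows that a lock taken
through a pointer is unusable until the init path has set that pointer.

```c
/*
 * Userspace model only. Everything named model_* is hypothetical and
 * merely stands in for the kernel code discussed in this report.
 */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Trivial stand-in for a mutex: 1 when held, 0 when free. */
struct model_mutex {
	int locked;
};

/* Stand-in for the info structure: the lock is reached via a pointer. */
struct model_svc_info {
	struct model_mutex *mutex;	/* models info->mutex / si->mutex */
};

static struct model_mutex model_nfsd_mutex;

/*
 * Models the initialisation that the report attributes to
 * nfsd_create_serv(): only after this call is the pointer valid.
 */
static void model_create_serv(struct model_svc_info *info)
{
	info->mutex = &model_nfsd_mutex;
}

/*
 * Models the start of a read: the lock is taken through the pointer.
 * Unlike the path in the oops above, this model checks the pointer and
 * returns -ENODEV (an assumed error code) instead of dereferencing NULL.
 */
static int model_pool_stats_start(struct model_svc_info *si)
{
	if (si->mutex == NULL)
		return -ENODEV;
	si->mutex->locked = 1;
	return 0;
}

/* Models the end of a read: releases the lock taken in start. */
static void model_pool_stats_stop(struct model_svc_info *si)
{
	assert(si->mutex != NULL);
	si->mutex->locked = 0;
}
```

In this model, calling model_pool_stats_start() on a zero-initialised
struct model_svc_info returns -ENODEV, and it succeeds once
model_create_serv() has run. The real kernel path has no such check, so
the NULL pointer reaches mutex_lock() directly.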
It would be great if you could look into this issue. Please let me know
if you need any more information.
Thanks,
Sourabh Jain