In testing of our kernel (based on 4.19, tainted, sorry!) on our aarch64 based hardware we've come upon the following oops (lightly edited to omit irrelevant details): 000:50:01.133 Unable to handle kernel paging request at virtual address 0000000000007a12 000:50:02.209 Process brctl (pid: 14467, stack limit = 0x00000000bcf7a578) 000:50:02.209 CPU: 1 PID: 14467 Comm: brctl Tainted: P 4.19.122 #1 000:50:02.209 Hardware name: Broadcom-v8A (DT) 000:50:02.209 pstate: 60000005 (nZCv daif -PAN -UAO) 000:50:02.209 pc : unregister_sysctl_table+0x1c/0xa0 000:50:02.209 lr : unregister_net_sysctl_table+0xc/0x20 000:50:02.209 sp : ffffff800e5ab9e0 000:50:02.209 x29: ffffff800e5ab9e0 x28: ffffffc016439ec0 000:50:02.209 x27: 0000000000000000 x26: ffffff8008804078 000:50:02.209 x25: ffffff80087b4dd8 x24: ffffffc015d65000 000:50:02.209 x23: ffffffc01f0d6010 x22: ffffffc01f0d6000 000:50:02.209 x21: ffffffc0166c4eb0 x20: 00000000000000bd 000:50:02.209 x19: ffffffc01f0d6030 x18: 0000000000000400 000:50:02.256 x17: 0000000000000000 x16: 0000000000000000 000:50:02.256 x15: 0000000000000400 x14: 0000000000000129 000:50:02.256 x13: 0000000000000001 x12: 0000000000000030 000:50:02.256 x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f 000:50:02.256 x9 : feff646663687161 x8 : ffffffffffffffff 000:50:02.256 x7 : fefefefefefefefe x6 : 0000000000008080 000:50:02.256 x5 : 00000000ffffffff x4 : ffffff8008905c38 000:50:02.256 x3 : ffffffc01f0d602c x2 : 00000000000000bd 000:50:02.256 x1 : ffffffc01f0d60c0 x0 : 0000000000007a12 000:50:02.256 Call trace: 000:50:02.256 unregister_sysctl_table+0x1c/0xa0 000:50:02.256 unregister_net_sysctl_table+0xc/0x20 000:50:02.256 __devinet_sysctl_unregister.isra.0+0x2c/0x60 000:50:02.256 inetdev_event+0x198/0x510 000:50:02.256 notifier_call_chain+0x58/0xa0 000:50:02.303 raw_notifier_call_chain+0x14/0x20 000:50:02.303 call_netdevice_notifiers_info+0x34/0x80 000:50:02.303 rollback_registered_many+0x384/0x600 000:50:02.303 unregister_netdevice_queue+0x8c/0x110 000:50:02.303 br_dev_delete+0x8c/0xa0 000:50:02.303 br_del_bridge+0x44/0x70 000:50:02.303 br_ioctl_deviceless_stub+0xcc/0x310 000:50:02.303 sock_ioctl+0x194/0x3f0 000:50:02.303 compat_sock_ioctl+0x678/0xc00 000:50:02.303 __arm64_compat_sys_ioctl+0xf0/0xcb0 000:50:02.303 el0_svc_common+0x70/0x170 000:50:02.303 el0_svc_compat_handler+0x1c/0x30 000:50:02.303 el0_svc_compat+0x8/0x18 000:50:02.303 Code: a90153f3 aa0003f3 f9401000 b40000c0 (f9400001) The crash is in the call to count_subheaders(header->ctl_table_arg). Although the header (being in x19 == 0xffffffc01f0d6030) looks like a normal kernel pointer, ctl_table_arg (x0 == 0x0000000000007a12) looks invalid. Trying to find the issue, we've started tracing header allocation being done by kzalloc in __register_sysctl_table and header freeing being done in drop_sysctl_table. Then we've noticed headers being freed which where not allocated before. The faulty freeing was done on parent->header at the end of drop_sysctl_table. The invariant on __register_sysctl_table seems to be that non-empty parent dir should have its header.nreg > 1. By failing this invariant (see WARN_ON hunk in the patch) we've come upon the conclusion that something is fishy with nreg counting. The root cause seems to be dropping new subdir in get_subdir function. Here are the new subdir adventures leading to the invariant failure above: 1. new subdir comes to being with nreg == 1 2. the nreg is being incremented in the found clause, nreg == 2 3. nreg is being decremented by the if (new) drop, nreg == 1 4. coming out of get_subdir, insert_header increments nreg: nreg == 2 5. nreg is being decremented at the end of __register_sysctl_table The fix seems to be avoiding step 3 if new subdir is the one being returned. The patch does this and also adds warning for the nreg invariant. --- fs/proc/proc_sysctl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index b6f5d459b087..12fa5803dcb3 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -1010,7 +1010,7 @@ static struct ctl_dir *get_subdir(struct ctl_dir *dir, namelen, namelen, name, PTR_ERR(subdir)); } drop_sysctl_table(&dir->header); - if (new) + if (new && new != subdir) drop_sysctl_table(&new->header); spin_unlock(&sysctl_lock); return subdir; @@ -1334,6 +1334,7 @@ struct ctl_table_header *__register_sysctl_table( goto fail_put_dir_locked; drop_sysctl_table(&dir->header); + WARN_ON(dir->header.nreg < 2); spin_unlock(&sysctl_lock); return header; -- 2.19.2