Boris Sukholitko <boris.sukholitko@xxxxxxxxxxxx> writes: > In testing of our kernel (based on 4.19, tainted, sorry!) on our aarch64 based hardware > we've come upon the following oops (lightly edited to omit irrelevant > details): I don't doubt you are seeing problems. I need to refresh my knowledge of that code to know for certain but I am not yet convinced this is the problem. I don't see any reason that the assertion your code makes would necessarily be a problem. Why can't a directory only have a single entry? I am not saying you are wrong. Just that I have not looked enough to verify there is an invariant being violated there. A very confusiong part of this situation is the fact he code has been stable for quite a while. I would have suspected someone's fuzzer to trigger problems on something other than aarch64. Especially if registering and unregistering a network device is enough to cause this. As that can be performed as non-root. Eric > The crash is in the call to count_subheaders(header->ctl_table_arg). > > Although the header (being in x19 == 0xffffffc01f0d6030) looks like a > normal kernel pointer, ctl_table_arg (x0 == 0x0000000000007a12) looks > invalid. > > Trying to find the issue, we've started tracing header allocation being > done by kzalloc in __register_sysctl_table and header freeing being done > in drop_sysctl_table. > > Then we've noticed headers being freed which where not allocated before. > The faulty freeing was done on parent->header at the end of > drop_sysctl_table. > > The invariant on __register_sysctl_table seems to be that non-empty > parent dir should have its header.nreg > 1. By failing this invariant > (see WARN_ON hunk in the patch) we've come upon the conclusion that > something is fishy with nreg counting. > > The root cause seems to be dropping new subdir in get_subdir function. > Here are the new subdir adventures leading to the invariant failure > above: > > 1. new subdir comes to being with nreg == 1 > 2. the nreg is being incremented in the found clause, nreg == 2 > 3. nreg is being decremented by the if (new) drop, nreg == 1 > 4. coming out of get_subdir, insert_header increments nreg: nreg == 2 > 5. nreg is being decremented at the end of __register_sysctl_table > > The fix seems to be avoiding step 3 if new subdir is the one being > returned. The patch does this and also adds warning for the nreg > invariant. > --- > fs/proc/proc_sysctl.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c > index b6f5d459b087..12fa5803dcb3 100644 > --- a/fs/proc/proc_sysctl.c > +++ b/fs/proc/proc_sysctl.c > @@ -1010,7 +1010,7 @@ static struct ctl_dir *get_subdir(struct ctl_dir *dir, > namelen, namelen, name, PTR_ERR(subdir)); > } > drop_sysctl_table(&dir->header); > - if (new) > + if (new && new != subdir) > drop_sysctl_table(&new->header); > spin_unlock(&sysctl_lock); > return subdir; > @@ -1334,6 +1334,7 @@ struct ctl_table_header *__register_sysctl_table( > goto fail_put_dir_locked; > > drop_sysctl_table(&dir->header); > + WARN_ON(dir->header.nreg < 2); > spin_unlock(&sysctl_lock); > > return header;