On Sat, Dec 22, 2018 at 11:17 AM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > On Sat, Dec 22, 2018 at 10:12:21AM +0800, Yafang Shao wrote: > > The callers of insert_header() will drop sysctl table whatever > > insert_header() successes or fails. > > So we can't call drop_sysctl_table() in insert_header(). > > > > The call sites of insert_header() are all in fs/proc/proc_sysctl.c. > > Except that at least insert_links() does this: > > ... > core_parent->header.nreg++; > ... > err = insert_header(core_parent, links); > if (err) > kfree(links); > out: > drop_sysctl_table(&core_parent->header); > with that drop clearly paired with explicit increment of nreg. So your > patch appears to break at least that one. > > Looking at the rest of callers... __register_sysctl_table() demonstrates > the same pattern - explicit increment of nreg, then insert_header(), > then *unconditional* drop undoing that increment. > > The third and last caller (get_subdir()) appears to be different. > We do insert_header(), if it succeeds we bump nreg on new and > unconditionally drop a reference to dir, as well as new. > > Let's look at the callers of that sucker. > > /* Reference moved down the diretory tree get_subdir */ > dir->header.nreg++; > spin_unlock(&sysctl_lock); > > /* Find the directory for the ctl_table */ > for (name = path; name; name = nextname) { > int namelen; > nextname = strchr(name, '/'); > if (nextname) { > namelen = nextname - name; > nextname++; > } else { > namelen = strlen(name); > } > if (namelen == 0) > continue; > > dir = get_subdir(dir, name, namelen); > if (IS_ERR(dir)) > goto fail; > } > > spin_lock(&sysctl_lock); > if (insert_header(dir, header)) > goto fail_put_dir_locked; > > Aha... So we are traversing a tree and it smells like get_subdir() > expects dir to be pinned by the caller, drops that reference and > either fails or returns the next tree node pinned. > > _IF_ that matches the behaviour of get_subdir(), the code above > makes sense - > grab dir > for each non-empty pathname component > next = get_subdir(dir, component) > // dir got dropped > if get_subdir has failed > goto fail > // next got grabbed > dir = next > insert_header(dir, header) > drop dir > if insert_header was successful > return header > fail: > // all references grabbed by the above are dropped by now > > So let's look at get_subdir() from refcounting POV (ignoring the locking > for now): > subdir = find_subdir(dir, name, namelen); > if (!IS_ERR(subdir)) > goto found; > if (PTR_ERR(subdir) != -ENOENT) > goto failed; > yeccchhhhhhh.... What's wrong with if (subdir == ERR_PTR(-ENOENT))? > Anyway, find_subdir() appears to be refcounting-neutral. > new = new_dir(set, name, namelen); > OK, looks like we are creating a new object here. What's the value of .nreg > in it? new_dir() itself doesn't seem to set it, but the thing it calls at > the end (init_header()) does set it to 1. OK, so we'd got a reference to > new object, our counter being 1. On failure it appears to return NULL. > OK... > subdir = ERR_PTR(-ENOMEM); > if (!new) > goto failed; > right, so if allocation in new_dir() has failed, we are buggering off to the > same 'failed' label as on other exits. In failure case new_dir() is > refcounting-neutral, so we are in the same environment. > /* Was the subdir added while we dropped the lock? */ > subdir = find_subdir(dir, name, namelen); > if (!IS_ERR(subdir)) > goto found; > if (PTR_ERR(subdir) != -ENOENT) > goto failed; > Interesting... So we can get to 'failed' in some cases when new_dir() > has not failed. > /* Nope. Use the our freshly made directory entry. */ > err = insert_header(dir, &new->header); > subdir = ERR_PTR(err); > if (err) > goto failed; > Looks like it expects insert_header() to be refcounting-neutral in failure > case... > subdir = new; > found: > subdir->header.nreg++; > > OK, we can get here from 3 places: > * subdir found by lookup before we even tried to allocate new. > new is NULL, subdir has just gotten a reference grabbed. > * subdir found by re-lookup after new had been allocated. > new is non-NULL and we are holding one reference to it, subdir has > just gotten a reference grabbed and it's not equal to new. > * new got successfully inserted into dir; subdir is equal > to new and we'd just grabbed the second reference to it. > > Looks like in all those cases we have a reference grabbed on subdir > *and* a reference grabbed on new (if non-NULL). Covers all 3 cases.,, > > failed: > ... and now we rejoin the failure paths (a lousy label name, that - we > get here on success as well). > if (IS_ERR(subdir)) > whine > drop reference to dir (the one that caller had grabbed for us) > if new is not NULL > drop reference to new. > return subdir. > > OK, in success case we have ended up with the total effect > drop dir > grab subdir > Holds in all 3 cases above. On failure, we have dropped the reference > grabbed by new_dir (if any) and dropped dir. That matches the behaviour > guessed by the look of the caller, and it definitely expects > insert_header() to be refcounting-neutral on failures. > > And AFAICS, insert_header() is that. Before your patch, that is... > > The bottom line is, proposed behaviour change appears to be broken for all > callers. > > NAK. > > PS: I was going to add that get_subdir() was in bad need of comments, until > I actually looked around. And right in front of that function we have this: > /** > * get_subdir - find or create a subdir with the specified name. > * @dir: Directory to create the subdirectory in > * @name: The name of the subdirectory to find or create > * @namelen: The length of name > * > * Takes a directory with an elevated reference count so we know that > * if we drop the lock the directory will not go away. Upon success > * the reference is moved from @dir to the returned subdirectory. > * Upon error an error code is returned and the reference on @dir is > * simply dropped. > */ > > So it *IS* documented, description does match the behaviour and > I should've bloody well checked if it's there first. And verified > that it does match the code, of course, but that's generally > simpler than deriving the behaviour from code and trying to come > up with sane description of such. Many thanks for your detailed explanation! I undanstand it. Thanks Yafang