On Wed, 2020-06-03 at 16:30 -0300, Jason Gunthorpe wrote:
> On Wed, Jun 03, 2020 at 12:02:08PM -0700, James Bottomley wrote:
> > On Wed, 2020-06-03 at 15:36 -0300, Jason Gunthorpe wrote:
> > > On Wed, Jun 03, 2020 at 11:04:35AM -0700, James Bottomley wrote:
> > > > On Tue, 2020-06-02 at 21:22 -0300, Jason Gunthorpe wrote:
> > > > > On Tue, Jun 02, 2020 at 02:51:10PM -0700, James Bottomley
> > > > > wrote:
> > > > >
> > > > > > My first thought was "what? I got suckered into creating a
> > > > > > patch", thanks ;-) But now I look, all the error paths do
> > > > > > unwind back to the initial state, so kfree() on error looks
> > > > > > to be completely correct.
> > > > >
> > > > > It doesn't fully unwind if the kobject is put into a kset,
> > > > > then another thread can get the kref during kset_find_obj()
> > > > > and kfree() won't wait for the kref to go to 0. It must use
> > > > > put.
> > > >
> > > > That does seem a bit contrived: the only failure
> > > > kobject_add_internal() can get after kobj_kset_join() is from
> > > > directory creation. If directory creation fails, no name
> > > > appears in sysfs and no event for the name is sent, how did
> > > > another thread get the name to pass in to kset_find_obj()?
> > >
> > > The other thread just guesses in a hostile way?
> > >
> > > Eg it looks like the iommu stuff just feeds in user data to
> > > kobj_kset_join().
> >
> > Well, if we have to go down the rabbit hole this far, it turns out
> > to be fixable because of the state_in_sysfs flag:
> >
> > @@ -899,7 +903,8 @@ struct kobject *kset_find_obj(struct kset *kset, const char *name)
> >  	spin_lock(&kset->list_lock);
> >
> >  	list_for_each_entry(k, &kset->list, entry) {
> > -		if (kobject_name(k) && !strcmp(kobject_name(k), name)) {
> > +		if (kobject_name(k) && k->state_in_sysfs &&
> > +		    !strcmp(kobject_name(k), name)) {
> >  			ret = kobject_get_unless_zero(k);
> >  			break;
> >  		}
> >
> > That would ensure the name can't be found until the sysfs directory
> > creation has succeeded, which would be the point from which
> > kobject_init_and_add() can't fail.
>
> Convoluted, and needs something on the store of state_in_sysfs too,
> but could work.

The store of state_in_sysfs is already done in kobject_add_internal().
It's an existing flag people already use to tell if the kobject has
been exposed in sysfs.  However, it's set after the sysfs directory
creation succeeds.  This is the code with some debugging removed:

	error = create_dir(kobj);
	if (error) {
		kobj_kset_leave(kobj);
		kobject_put(parent);
		kobj->parent = NULL;
		...
	} else
		kobj->state_in_sysfs = 1;

	return error;

So it's the very last thing set before kobject_add_internal() returns
success ... which is pretty much the last thing kobject_init_and_add()
does.

> It feels more robust to stick with the put though..

possibly ... like I said, the only concern with the put path is that
->release has state expectations that aren't met if
kobject_init_and_add fails.  I think none of the callers that currently
does kfree() on error has a problem with this, but it should be
checked.  However, adding unwinding correctly keeps either kfree or put
correct in the event of kobject_init_and_add() failure.

James
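
For anyone who wants to poke at the lookup rule outside the kernel, here is
a rough single-threaded userspace model of it. It is a sketch under stated
assumptions, not kernel code: every mock_* name is hypothetical, the plain
int refcount only stands in for the kref, and the list_lock and any real
concurrency are left out. The require_in_sysfs argument is also an
invention of this sketch, used to compare the lookup with and without the
state_in_sysfs check.

```c
/* Userspace model only: all mock_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mock_kobject {
	const char *name;
	int refcount;		/* stands in for the kref */
	int state_in_sysfs;	/* set only after directory creation succeeds */
	struct mock_kobject *next;
};

struct mock_kset {
	struct mock_kobject *head;
};

/* Models kobj_kset_join(): the object becomes reachable through the set. */
void mock_kset_join(struct mock_kset *kset, struct mock_kobject *k)
{
	assert(k->refcount > 0);	/* only live objects may join */
	k->next = kset->head;
	kset->head = k;
}

/* Models kobj_kset_leave(): unlink the object on the error path. */
void mock_kset_leave(struct mock_kset *kset, struct mock_kobject *k)
{
	struct mock_kobject **p;

	for (p = &kset->head; *p; p = &(*p)->next) {
		if (*p == k) {
			*p = k->next;
			k->next = NULL;
			return;
		}
	}
}

/*
 * Models the lookup. With require_in_sysfs != 0 an object that has joined
 * the set but whose directory has not been created yet is skipped, so a
 * guessed name cannot take a reference on it.
 */
struct mock_kobject *mock_kset_find(struct mock_kset *kset, const char *name,
				    int require_in_sysfs)
{
	struct mock_kobject *k;

	for (k = kset->head; k; k = k->next) {
		if (k->name && (!require_in_sysfs || k->state_in_sysfs) &&
		    !strcmp(k->name, name)) {
			k->refcount++;	/* caller now holds a reference */
			return k;
		}
	}
	return NULL;
}
```

In this model, an object that has joined the set with state_in_sysfs still
0 is invisible to mock_kset_find(..., 1), so its refcount stays at 1 across
a failed add and freeing it directly would not strand another holder.
Calling mock_kset_find(..., 0) on the same object bumps the count, which is
the case where only a put is safe.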