On Thu, 5 May 2016, Jessica Yu wrote: > +++ Josh Poimboeuf [05/05/16 10:04 -0500]: > > On Thu, May 05, 2016 at 04:25:48PM +0200, Miroslav Benes wrote: > > > On Thu, 5 May 2016, Josh Poimboeuf wrote: > > > > > > > On Thu, May 05, 2016 at 10:28:12AM +0200, Miroslav Benes wrote: > > > > > I think it boils down to the following problem. > > > > > > > > > > 1. CONFIG_DEBUG_KOBJECT_RELEASE=y > > > > > > > > > > 2. we have dynamic kobjects, so there is a pointer in klp_patch to > > > struct > > > > > kobject > > > > > > > > > > 3. it is allocated during klp_init_patch() and all is fine > > > > > > > > > > 4. now we want to remove the patch module. It is disabled and > > > module_put() > > > > > is called. User calls rmmod on the module. > > > > > > > > > > 5. klp_unregister_patch() is called in __exit method. > > > > > > > > > > 6. klp_free_patch() is called. > > > > > > > > > > 7. kobject_put(patch->kobj) is called. > > > > > > > > > > ...now it gets interesting... > > > > > > > > > > 8. among others kobject_cleanup() is scheduled as a delayed work (this > > > is > > > > > important). > > > > > > > > > > 9. there is no completion, so kobject_put returns and the module goes > > > > > away. > > > > > > > > > > 10. someone calls patch enabled_store attribute (for example). They > > > can > > > > > because kobject_cleanup() has not been called yet. It is delayed > > > > > scheduled. > > > > > > > > > > ...crash... > > > > > > > > But what exactly causes the crash? In enabled_store() we can see that > > > > the patch isn't in the list, so we can return -EINVAL. > > > > > > Ok, bad example. Take enabled_show() instead. It could be fixed in the > > > same way, but I am not sure it is the right thing to do. It does not scale > > > because the problem is elsewhere. > > > > > > Anyway, it is (even if theoretically) there in my opinion and we > > > have two options. > > > > > > 1. We could forget about CONFIG_DEBUG_KOBJECT_RELEASE and all is ok > > > without completion and regardless of dynamic/static kobject allocation. > > > > > > 2. We introduce completion and we are ok even with > > > CONFIG_DEBUG_KOBJECT_RELEASE=y and again regardless of dynamic/static > > > kobject allocation. > > > > I would disagree with the statement that the dynamic kobject doesn't > > scale. We would just need a helper function to get from a kobject to > > its klp_patch. > > > > In fact, to me it seems like the right way to do it. It doesn't make > > sense for the code which creates the kobject to be different from the > > code which initializes it. It's slightly out of context, but > > kobject.txt does say: > > > > "Code which creates a kobject must, of course, initialize that object." > > > > I view the completion as a hack to compensate for the fact that we're > > abusing the kobject interface. And so it makes sense to me that > > CONFIG_DEBUG_KOBJECT_RELEASE would cause problems, because we're using > > kobjects in the wrong way. > > > > So in my view, the two options are: > > > > 1. Convert the kobject to dynamic as I described. > > > > 2. Change the klp_register() interface so that klp_patch gets allocated > > in livepatch code. > > > > I'd be curious to hear what others think. > > So, I think both of these solutions would enable us to get rid of > the completion. Let me try to summarize.. Whoa, thanks for a good summary. That's exactly what was needed :) > For solution #1, if we dynamically allocate the kobject, i.e. we have a > pointer now instead of having it embedded in the klp_patch struct, we no > longer need to worry if the corresponding klp_patch gets deallocated under > our nose. Since the kobject_cleanup function is delayed w/ > CONFIG_DEBUG_KOBJECT_RELEASE, it is possible to have sysfs entries that > refer to a klp_patch that no longer exists. Thus if any of the sysfs > functions get called, we would have to take care to ensure that the > klp_patch struct corresponding to the kobject in question actually still > exists. In this case, all sysfs functions would require an extra check to > make sure the matching klp_patch is still on the patches list and return an > error if it isn't found. The "pro" is that this change would be simple, the > "con" is that now kobjects are decoupled and managed completely separately > from the object (klp_patch) with which they are associated, which doesn't > feel 100% right. Yes, I've thought a lot about it during the night (I've even dreams about kobjects now) and this doesn't seem as a good solution. We would need to make sure that we have appropriate checks everytime we change something in the code. Moreover I don't like walking through the list of patches each time. It is not critical path, but it is not nice anyway. This is what I meant with scaling problem previously. I agree that while compaction makes all the problems go away, it is more like 'after a fashion' solution. Which leads me to... > For solution #2, we could have livepatch manage the (de)allocation of > klp_patch objects internally. Maybe in this scenario the caller would need > to request a klp_patch object be allocated and the caller would fill out the > returned klp_patch struct as appropriate. In this case, we would be able > leave the kobject embedded in the klp_patch struct (and dynamic kobjects > wouldn't be needed), as livepatch would now have control of both structures. I think this is the way to go. > Then during the patch module exit path, when kobject_put is called, the > klp_patch struct would be freed in its kobject's release function. We > wouldn't have to hold up rmmod, and delayed execution of kobject_cleanup > wouldn't break anything, because a klp_patch would then have the same > "lifespan" > as its corresponding kobject, and therefore it would be safe to invoke > enabled_store & co. up until kobject_cleanup is finally executed. We'd be > able to use container_of in this case as well. In addition, we wouldn't > have to force all sysfs functions to support an awkward edge-case (i.e. > checking if the corresponding klp_patch still exists). I think this > solution also matches better with the typical use-case of the kobject > release method, as described in kobject.txt (replacing 'my_object' with > klp_patch): > --- > (...) > This notification is done through a kobject's release() method. > Usually such a method has a form like: > > void my_object_release(struct kobject *kobj) > { > struct my_object *mine = container_of(kobj, struct my_object, kobj); > /* Perform any additional cleanup on this object, then... */ > kfree(mine); > } > --- > Apologies for the giant wall of text. In any case I feel like solution #2 > is actually closer in line with how kobjects are normally used, embedded in > the structures they refer to, which get deallocated once their refcount > hits 0. What do people think? I agree. Josh and jikos proposed this as well, so I think we have an agreement. I'm gonna prepare a patch independent of the consistency model as this is a separate issue. Thanks a lot, Miroslav -- To unsubscribe from this list: send the line "unsubscribe live-patching" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html