On Wed, Feb 26, 2020 at 07:18:44PM +0800, Yufen Yu wrote:
> Hi, all
>
> We have reported a use-after-free crash for bdi device in
> __blkg_prfill_rwstat() (see Patch #3). The bug is caused by printing
> device kobj->name while the device and kobj->name have been freed by
> bdi_unregister().

How does that happen?  Who has access to a kobject without also having
the reference count incremented at the same time?  Is this through sysfs
or somewhere within the kernel itself?

> In fact, commit 68f23b8906 "memcg: fix a crash in wb_workfn when
> a device disappears" has tried to address the issue, but the code
> is still somewhat racy after that commit.

That commit is really odd, and I think it is just papering over the real
issue, which is shown in the changelog for that commit.

A kobject can be unregistered, like bdi_unregister() does, even if there
are active references to it.  But someone also needs to go around and
drop those references in order for things to be properly freed.

It feels like the use of struct device (and by virtue of that, struct
kobject and really a kref) here is not being done correctly at all.  The
rule should be, "whenever you pass a pointer to a device off, the
reference count is incremented".  Somehow that is not happening here, and
RCU is not really going to solve the issue; it's only going to delay the
problem from showing up until much later.

> In this patchset, we try to protect device lifetime with RCU, avoiding
> the device being freed while others are using it.

The struct device refcount should be all that is needed.  Don't use RCU
just to "delay freeing this object until some later time because someone
else might have a pointer to it".  That's ripe for disaster.

> A way which may fix the problem is to copy the device name into special
> memory (as discussed in [0]), but that also needs lock protection.

Hah, all that is needed is the name here?  That's sad.

thanks,

greg k-h
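
To illustrate the "take a reference whenever you pass the pointer off"
rule above, here is a minimal userspace sketch.  It is not kernel code:
struct fake_device, fake_device_get() and fake_device_put() are
hypothetical stand-ins for struct device, get_device() and put_device(),
and a plain int replaces the atomic kref to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct device + kref; not kernel code. */
struct fake_device {
	int refcount;	/* a real kref is atomic; plain int keeps this short */
	char *name;	/* plays the role of kobj->name */
};

/* Counts completed releases, only so the sketch can be checked. */
static int fake_devices_freed;

/* Create a device; the creator holds the initial reference. */
static struct fake_device *fake_device_create(const char *name)
{
	struct fake_device *dev = malloc(sizeof(*dev));

	if (!dev)
		return NULL;
	dev->name = malloc(strlen(name) + 1);
	if (!dev->name) {
		free(dev);
		return NULL;
	}
	strcpy(dev->name, name);
	dev->refcount = 1;
	return dev;
}

/* Take a reference before handing the pointer to another user. */
static struct fake_device *fake_device_get(struct fake_device *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

/* Drop a reference; the name and device are freed only on the last put. */
static void fake_device_put(struct fake_device *dev)
{
	if (!dev)
		return;
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		free(dev->name);
		free(dev);
		fake_devices_freed++;
	}
}
```

With this pattern, a reader that called fake_device_get() can still print
the name after the owner has dropped its own reference, because the
memory is released only when the reader calls fake_device_put() as well.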