On Sat, Jul 20, 2024 at 07:17:55AM +0200, Greg KH wrote:
> On Fri, Jul 19, 2024 at 11:55:13AM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@xxxxxxxxxx>
> >
> > Get a reference to the device's kobject while storing and showing device
> > attributes so that the device is valid for the lifetime of the sysfs access.
> > Without this, the device may be released and use-after-free will occur.
> >
> > This is an easy problem to recreate with pci switches. Basic topology on my
> > qemu test machine:
> >
> > -[0000:00]-+-00.0  Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> >            +-01.0-[01-04]----00.0-[02-04]--+-00.0-[03]--
> >                                            \-01.0-[04]----00.0  Red Hat, Inc. Virtio block device
> >
> > Simultaneously remove devices 04:00.0 and 01:00.0 and you'll hit it:
> >
> >  # echo 1 > /sys/bus/pci/devices/0000\:04\:00.0/remove &
> >  # echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/remove
>
> So you remove the parent before the child and also want to remove the
> child at the same time?  You are going to have bad problems here :)

The example I provided is surely a user error, but it just demonstrates the
issue. The parent device can be removed at any time without user action:
hotplug and error handling take devices down automatically. And it's not just
a problem when requesting concurrent removal of the child device; it's still a
use-after-free from just accessing its attributes.

> > @@ -2433,12 +2433,15 @@ static ssize_t dev_attr_show(struct kobject *kobj, struct attribute *attr,
> >  	struct device *dev = kobj_to_dev(kobj);
> >  	ssize_t ret = -EIO;
> >
> > +	if (!kobject_get_unless_zero(kobj))
> > +		return -ENXIO;
>
> We've been down this path before, and it doesn't end well from what I
> recall.  Attributes that when written to remove themselves need to call
> the correct function to do so (look at how scsi does it).  I think this
> change will now break that functionality.  Look in the email archives a
> long time ago for more details, I can't recall them at the moment,
> sorry.

Thanks for the suggestion. I'll try to figure out what scsi does and see if
that strategy can work here.
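
For reference, a minimal userspace sketch of the "get unless zero" idea behind
the kobject_get_unless_zero() check in the hunk above. This is only a model
under stated assumptions, not kernel code: struct obj, obj_init(),
obj_get_unless_zero(), obj_put() and attr_show() are hypothetical stand-ins
for a kobject, kobject_get_unless_zero(), kobject_put() and dev_attr_show(),
and the "release" just sets a flag instead of freeing anything.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kobject: a reference count plus a payload. */
struct obj {
	atomic_int refcount;
	int value;
	bool released;	/* set by the final put; real code would free instead */
};

static void obj_init(struct obj *o, int value)
{
	atomic_init(&o->refcount, 1);	/* creator holds the first reference */
	o->value = value;
	o->released = false;
}

/* Take a reference only if the count has not already dropped to zero. */
static bool obj_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, old is refreshed with the current count; retry. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the last put marks the object as released. */
static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcount, 1) == 1)
		o->released = true;
}

/* Models an attribute show: pin the object for the duration of the access. */
static int attr_show(struct obj *o, int *out)
{
	if (!obj_get_unless_zero(o))
		return -ENXIO;	/* object is already on its way out */

	*out = o->value;
	obj_put(o);
	return 0;
}
```

With an object initialised to one reference, attr_show() returns 0 and fills in
the value; after the final obj_put() it returns -ENXIO instead of touching the
payload. The model never frees the struct, so reading the count after release
is safe here. In the kernel, the same check only helps while the memory
holding the kobject is itself still valid.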