Re: [PATCH v3 01/14] cxl/region: Add region creation ABI

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 2, 2022 at 11:03 AM Ben Widawsky <ben.widawsky@xxxxxxxxx> wrote:
>
> On 22-02-02 11:00:04, Dan Williams wrote:
> > On Wed, Feb 2, 2022 at 10:48 AM Ben Widawsky <ben.widawsky@xxxxxxxxx> wrote:
> > >
> > > On 22-02-02 10:28:11, Ben Widawsky wrote:
> > > > On 22-02-02 10:26:06, Ben Widawsky wrote:
> > > > > On 22-01-28 10:59:26, Dan Williams wrote:
> > > > > > On Fri, Jan 28, 2022 at 10:14 AM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> > > > > > [..]
> > > > > > > Here is that put_device() I was expecting, that kfree() earlier was a
> > > > > > > double-free it seems.
> > > > > > >
> > > > > > > Also, I would have expected a devm action to remove this. Something like:
> > > > > > >
> > > > > > > struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> > > > > > >
> > > > > > > cxl_device_lock(&port->dev);
> > > > > > > if (port->dev.driver)
> > > > > > >     devm_cxl_add_region(port->uport, cxld, id);
> > > > >
> > > > > I assume you mean devm_cxl_delete_region(), yes?
> > > > >
> > > > > > > else
> > > > > > >     rc = -ENXIO;
> > > > > > > cxl_device_unlock(&port->dev);
> > > > > > >
> > > > > > > ...then no matter what you know the region will be unregistered when
> > > > > > > the root port goes away.
> > > > > >
> > > > > > ...actually, the lock and ->dev.driver check here are not needed
> > > > > > because this attribute is only registered while the cxl_acpi driver is
> > > > > > bound. So, it is safe to assume this is protected as decoder remove
> > > > > > synchronizes against active sysfs users.
> > > > >
> > > > > I'm somewhat confused when you say devm action to remove this. The current auto
> > > > > region deletion happens when the ->release() is called. Are you suggesting when
> > > > > the root decoder is removed I delete the regions at that point?
> > > >
> > > > Hmm. I went back and looked and I had changed this functionality at some
> > > > point... So forget I said that, it isn't how it's working currently. But the
> > > > question remains, are you suggesting I delete in the root decoder
> > > > unregistration?
> > >
> > > I think it's easier if I write what I think you mean.... Here are the relevant
> > > parts:
> > >
> > > devm_cxl_region_delete() is removed entirely.
> > >
> > > static void unregister_region(void *_cxlr)
> > > {
> > >         struct cxl_region *cxlr = _cxlr;
> > >
> > >         device_unregister(&cxlr->dev);
> > > }
> > >
> > >
> > > static int devm_cxl_region_add(struct cxl_decoder *cxld, struct cxl_region *cxlr)
> > > {
> > >         struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> > >         struct device *dev = &cxlr->dev;
> > >         int rc;
> > >
> > >         rc = dev_set_name(dev, "region%d.%d:%d", port->id, cxld->id, cxlr->id);
> > >         if (rc)
> > >                 return rc;
> > >
> > >         rc = device_add(dev);
> > >         if (rc)
> > >                 return rc;
> > >
> > >         return devm_add_action_or_reset(&cxld->dev, unregister_region, cxlr);
> >
> > Decoders can't host devm actions. The host for this action would need
> > to be the parent port.
>
> Happy to change it since I can't imagine a decoder would go down without the
> port also going down. Can you please explain why a decoder can't host a devm
> action though. I'd like to understand that better.

So, devm releases resources at 2 times one of which is "too late". The
natural / expected point at which they are released is by the driver
core at $driver->remove($dev) time. There is also a backstop release
point at $dev->${type,class,bus}->release() time. The latter one is
"too late" because it effectively leave the device registered
indefinitely which is broken because the parent sysfs directory
hierarchy for that device will have already been removed, so the late
release may crash depending on when the last put_device($dev) is
performed. The decoder never experiences a ->remove() event, but it's
parent port does (at least for non-root ports). For this case it will
likely need to reach further up and use the same devm host as the
decoder itself which is the ACPI0017 device.

>
> >
> > > }
> > >
> > > static ssize_t delete_region_store(struct device *dev,
> > >                                    struct device_attribute *attr,
> > >                                    const char *buf, size_t len)
> > > {
> > >         struct cxl_decoder *cxld = to_cxl_decoder(dev);
> > >         struct cxl_region *cxlr;
> > >
> > >         cxlr = cxl_find_region_by_name(cxld, buf);
> > >         if (IS_ERR(cxlr))
> > >                 return PTR_ERR(cxlr);
> > >
> > >         devm_release_action(dev, unregister_region, cxlr);
> >
> > Yes, modulo the same comment as before that the decoder object is not
> > a suitable devm host. This also needs a solution for the race between
> > these 2 actions:
> >
> > echo "ACPI0017:00" > /sys/bus/platform/drivers/cxl_acpi/unbind
> > echo $region > /sys/bus/cxl/devices/$decoder/delete_region
>
> Is there a better solution than taking the root port lock?

Depends what lockdep says. The first choice lock to synchronize those
2 actions would be the ACPI0017 device_lock, but lockdep might point
out a problem with that vs sysfs teardown synchronization.



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux