Hi,

On 5/5/21 11:17 AM, Andy Shevchenko wrote:
> On Wed, May 5, 2021 at 12:07 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>> On 5/4/21 9:52 AM, Andy Shevchenko wrote:
>>> On Monday, May 3, 2021, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>
> ...
>
>>> +       fwnode = device_get_next_child_node(kdev, fwnode);
>
>>> Who is dropping reference counting on fwnode ?
>>
>> We are dealing with ACPI fwnode-s here and those are not ref-counted, they
>> are embedded inside a struct acpi_device and their lifetime is tied to
>> that struct. They should probably still be ref-counted (with the count
>> never dropping to 0) so that the generic fwnode functions behave the same
>> anywhere but atm the ACPI nodes are not refcounted, see: acpi_get_next_subnode()
>> in drivers/acpi/property.c which is the get_next_child_node() implementation
>> for ACPI fwnode-s.
>
> Yes, ACPI currently is exceptional, but fwnode API is not.
> If you may guarantee that this case won't ever be outside of ACPI

Yes I can guarantee that currently this code (which is for the i915
driver only) only deals with ACPI fwnodes.

> and
> even though if ACPI won't ever gain a reference counting for fwnodes,
> we can leave it as is.

Would it not be better to add fake ref-counting to the ACPI fwnode
next_child_node() op though. I believe just getting a reference
on the return value there should work fine; and then all fwnode
implementations would be consistent ?

(note I did not check that the of and swnode code do return
a reference but I would assume so).

>>> I'm in the middle of a pile of fixes for fwnode refcounting when
>>> for_each_child or get_next_child is used. So, please double check
>>> you drop a reference.
>>
>> The kdoc comments on device_get_next_child_node() / fwnode_get_next_child_node()
>> do not mention anything about these functions returning a reference.
>
> It's possible. I dunno if it had to be done earlier. Sakari?
>
>> So I think we need to first make up our mind here how we want this all to
>> work and then fix the actual implementation and docs before fixing callers.
>
> We have already issues, so I prefer not to wait for a documentation
> update, because for old kernels it will still be an issue.

I wonder if we really have issues though. In practice fwnodes are generated
from a devicetree or ACPI tables (or by platform code adding swnodes) and
then these pretty much stick around forever. IOW the initial refcount of
1 is never dropped, at least for of-nodes and ACPI nodes.

I know there are some exceptions like device-tree overlays, which I guess
may also be dynamically removed again, but those exceptions are not widely
used. And if we forget to drop a reference, in the worst case we have a small
non-recurring (so not growing) memleak.

Whereas if we start adding put() calls everywhere we may end up freeing
things which are still in use; or dropping refcounts below 0, triggering
WARNs in various places (IIRC). So it seems the cure is potentially
worse than the disease in this case.

So if you want to work on this, then IMHO it would be best to first make
sure that all the fwnode implementations behave in the same way wrt
ref-counting, before adding the missing put() calls in various places.

And once the behavior is consistent then we can also document this properly,
making it easier for other people to do the right thing when using these
functions.

Regards,

Hans
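
P.S. To make the "consistent behavior" discussed above concrete, here is a
minimal userspace sketch of one possible convention: the iterator returns
the next child with a reference taken and drops the reference on the
previous child. This is modelled on what of_get_next_child() documents; it
is not the kernel's fwnode code, and struct node, node_get(), node_put()
and node_get_next_child() are hypothetical names used only for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted firmware node
 * (NOT the kernel's struct fwnode_handle). */
struct node {
	int refcount;           /* starts at 1: the reference held by the tree */
	struct node *children;  /* array of child nodes */
	size_t nchildren;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (!n)
		return;
	/* A put with no reference left would be an over-put bug. */
	assert(n->refcount > 0);
	n->refcount--;
}

/*
 * Return the child after prev (or the first child if prev is NULL) with a
 * reference taken, and drop the reference the caller held on prev.
 */
static struct node *node_get_next_child(struct node *parent, struct node *prev)
{
	size_t idx = prev ? (size_t)(prev - parent->children) + 1 : 0;
	struct node *next = idx < parent->nchildren ? &parent->children[idx] : NULL;

	node_get(next);
	node_put(prev);
	return next;
}

/* Full walk: the iterator drops every reference it handed out. */
static size_t count_children(struct node *parent)
{
	struct node *child = NULL;
	size_t n = 0;

	while ((child = node_get_next_child(parent, child)))
		n++;
	return n;
}

/* Early exit: the caller still owns a reference on the returned node. */
static struct node *find_child(struct node *parent, size_t wanted)
{
	struct node *child = NULL;
	size_t i = 0;

	while ((child = node_get_next_child(parent, child))) {
		if (i++ == wanted)
			return child;   /* caller must node_put() this */
	}
	return NULL;
}
```

With this convention a loop that runs to completion needs no explicit
put(), while a caller that breaks out early has to put the node it stopped
at. A missing put in that case only leaks one count; an extra put is what
trips the assert here, which is the userspace analogue of the WARN
mentioned above.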