Re: [PATCH v1 1/5] of/platform: Speed up of_find_device_by_node()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Saravana,

(I notice that I never seem to spell your name correctly.  Apologies for that,
both past and future).

This email was never answered.

-Frank


On 5/24/19 5:12 PM, Frank Rowand wrote:
> On 5/24/19 11:21 AM, Saravana Kannan wrote:
>> On Fri, May 24, 2019 at 10:56 AM Frank Rowand <frowand.list@xxxxxxxxx> wrote:
>>>
>>> Hi Sarvana,
>>>
>>> I'm not reviewing patches 1-5 in any detail, given my reply to patch 0.
>>>
>>> But I had already skimmed through this patch before I received the
>>> email for patch 0, so I want to make one generic comment below,
>>> to give some feedback as you continue thinking through possible
>>> implementations to solve the underlying problems.
>>
>> Appreciate the feedback Frank!
>>
>>>
>>>
>>> On 5/23/19 6:01 PM, Saravana Kannan wrote:
>>>> Add a pointer from device tree node to the device created from it.
>>>> This allows us to find the device corresponding to a device tree node
>>>> without having to loop through all the platform devices.
>>>>
>>>> However, fallback to looping through the platform devices to handle
>>>> any devices that might set their own of_node.
>>>>
>>>> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
>>>> ---
>>>>  drivers/of/platform.c | 20 +++++++++++++++++++-
>>>>  include/linux/of.h    |  3 +++
>>>>  2 files changed, 22 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
>>>> index 04ad312fd85b..1115a8d80a33 100644
>>>> --- a/drivers/of/platform.c
>>>> +++ b/drivers/of/platform.c
>>>> @@ -42,6 +42,8 @@ static int of_dev_node_match(struct device *dev, void *data)
>>>>       return dev->of_node == data;
>>>>  }
>>>>
>>>> +static DEFINE_SPINLOCK(of_dev_lock);
>>>> +
>>>>  /**
>>>>   * of_find_device_by_node - Find the platform_device associated with a node
>>>>   * @np: Pointer to device tree node
>>>> @@ -55,7 +57,18 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
>>>>  {
>>>>       struct device *dev;
>>>>
>>>> -     dev = bus_find_device(&platform_bus_type, NULL, np, of_dev_node_match);
>>>> +     /*
>>>> +      * Spinlock needed to make sure np->dev doesn't get freed between NULL
>>>> +      * check inside and kref count increment inside get_device(). This is
>>>> +      * achieved by grabbing the spinlock before setting np->dev = NULL in
>>>> +      * of_platform_device_destroy().
>>>> +      */
>>>> +     spin_lock(&of_dev_lock);
>>>> +     dev = get_device(np->dev);
>>>> +     spin_unlock(&of_dev_lock);
>>>> +     if (!dev)
>>>> +             dev = bus_find_device(&platform_bus_type, NULL, np,
>>>> +                                   of_dev_node_match);
>>>>       return dev ? to_platform_device(dev) : NULL;
>>>>  }
>>>>  EXPORT_SYMBOL(of_find_device_by_node);
>>>> @@ -196,6 +209,7 @@ static struct platform_device *of_platform_device_create_pdata(
>>>>               platform_device_put(dev);
>>>>               goto err_clear_flag;
>>>>       }
>>>> +     np->dev = &dev->dev;
>>>>
>>>>       return dev;
>>>>
>>>> @@ -556,6 +570,10 @@ int of_platform_device_destroy(struct device *dev, void *data)
>>>>       if (of_node_check_flag(dev->of_node, OF_POPULATED_BUS))
>>>>               device_for_each_child(dev, NULL, of_platform_device_destroy);
>>>>
>>>> +     /* Spinlock is needed for of_find_device_by_node() to work */
>>>> +     spin_lock(&of_dev_lock);
>>>> +     dev->of_node->dev = NULL;
>>>> +     spin_unlock(&of_dev_lock);
>>>>       of_node_clear_flag(dev->of_node, OF_POPULATED);
>>>>       of_node_clear_flag(dev->of_node, OF_POPULATED_BUS);
>>>>
>>>> diff --git a/include/linux/of.h b/include/linux/of.h
>>>> index 0cf857012f11..f2b4912cbca1 100644
>>>> --- a/include/linux/of.h
>>>> +++ b/include/linux/of.h
>>>> @@ -48,6 +48,8 @@ struct property {
>>>>  struct of_irq_controller;
>>>>  #endif
>>>>
>>>> +struct device;
>>>> +
>>>>  struct device_node {
>>>>       const char *name;
>>>>       phandle phandle;
>>>> @@ -68,6 +70,7 @@ struct device_node {
>>>>       unsigned int unique_id;
>>>>       struct of_irq_controller *irq_trans;
>>>>  #endif
>>>> +     struct device *dev;             /* Device created from this node */
>>>
>>> We have actively been working on shrinking the size of struct device_node,
>>> as part of reducing the devicetree memory usage.  As such, we need strong
>>> justification for adding anything to this struct.  For example, proof that
>>> there is a performance problem that can only be solved by increasing the
>>> memory usage.
>>
>> I didn't mean for people to focus on the deferred probe optimization.
> 
> I was speaking specifically of the of_find_device_by_node() optimization.
> I did not chase any further back in the call chain to see how that would
> impact anything else.  My comments stand, whether this patch is meant
> to optimize deferred probe optimization or to optimize something else.
> 
> 
>> In reality that was just a added side benefit of this series. The main
>> problem to solve is that of suppliers having to know when all their
>> consumers are up and managing the resources actively, especially in a
>> system with loadable modules where we can't depend on the driver to
>> notify the supplier because the consumer driver module might not be
>> available or loaded until much later.
>>
>> Having said that, I'm not saying we should go around and waste space
>> willy-nilly. But, isn't the memory usage going to increase based on
>> the number of DT nodes present in DT? I'd think as the number of DT
>> nodes increase it's more likely for those devices have more memory? So
>> at least in this specific case I think adding the field is justified.
> 
> Struct device_node is the nodes of the in kernel devicetree data.  This
> patch adds a field to every single node of the devicetree.
> 
> The patch series is also adding a new property, of varying size, to
> some nodes.  This also results in additional memory usage by
> devicetree. 
> 
> Arguing that a more complex system is likely to have more memory is
> likely true, but beside the point.  Minimizing devicetree memory
> used on less complex systems is also one of our goals.
> 
> 
>> Also, right now the look up is O(n) complexity and if we are trying to
>> add device links to most of the devices, that whole process becomes
>> O(n^2). Having this field makes the look up a O(1) and the entire
>> linking process a O(n) process. I think the memory usage increase is
>> worth the efficiency improvement.
> 
> Hand waving about O(n) and O(1) and O(n2) is not sufficient.  We require
> actual measurements that show O(n2) (when the existing algorithm is such)
> is a performance problem and that the proposed change to the algorithm
> results in a specific change in the performance.
> 
> The devicetree maintainers have decided that memory use is important and
> to be minimized, and the burden of proof that performance is an issue
> lies on the submitter of a patch that improves performance at the cost
> of memory.
> 
> Most devicetree boot time cpu overhead only affects the boot event.
> Added memory use persists for the entire booted lifetime of the system.
> 
> That is not to say that we never increase memory use to improve boot
> performance.  We have done so when the measured performance issue and
> measured performance improvement justified the change.
> 
>>
>> And if people are still strongly against it, we could make this a config option.
>>
>> -Saravana
>>
> 
> 




[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux