Hi,
On 26/10/2024 07:52, Saravana Kannan wrote:
In attempting to optimize fw_devlink runtime, I introduced numerous cycle
detection bugs by foregoing cycle detection logic under specific
conditions. Each fix has further narrowed the conditions for optimization.
It's time to give up on these optimization attempts and just run the cycle
detection logic every time fw_devlink tries to create a device link.
The specific bug report that triggered this fix involved a supplier fwnode
that never gets a device created for it. Instead, the supplier fwnode is
represented by the device that corresponds to an ancestor fwnode.
In this case, fw_devlink didn't do any cycle detection because the cycle
detection logic is only run when a device link is created between the
devices that correspond to the actual consumer and supplier fwnodes.
With this change, fw_devlink will run cycle detection logic even when
creating SYNC_STATE_ONLY proxy device links from a device that is an
ancestor of a consumer fwnode.
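To illustrate the idea outside the kernel, here is a minimal sketch of cycle detection keyed on nodes rather than on devices. It assumes a toy graph: `struct toy_node`, `toy_reaches()` and `MAX_SUPPLIERS` are hypothetical names invented for this example and are not part of drivers/base/core.c. The walk follows both supplier edges and the parent pointer, so a supplier that is only represented through an ancestor still takes part in the search.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUPPLIERS 4

/* Hypothetical stand-in for a firmware node; not struct fwnode_handle. */
struct toy_node {
	const char *name;
	struct toy_node *parent;                   /* ancestor that may carry the device */
	struct toy_node *suppliers[MAX_SUPPLIERS]; /* nodes this node depends on */
	size_t num_suppliers;
	bool visited;                              /* set only while on the current path */
};

/*
 * Return true if a walk starting at @sup, following supplier edges and
 * parent pointers, can reach @con. In that case, adding a link
 * "@con depends on @sup" would close a dependency cycle.
 */
static bool toy_reaches(struct toy_node *con, struct toy_node *sup)
{
	bool found = false;
	size_t i;

	assert(con);

	if (!sup || sup->visited)
		return false;

	/* Termination condition: compare nodes, not devices. */
	if (sup == con)
		return true;

	sup->visited = true;

	/* Keep walking after a hit so every cycle through @sup is explored. */
	for (i = 0; i < sup->num_suppliers; i++)
		if (toy_reaches(con, sup->suppliers[i]))
			found = true;

	/* A node without its own device is represented by an ancestor. */
	if (toy_reaches(con, sup->parent))
		found = true;

	sup->visited = false;
	return found;
}
```

With a consumer C, a supplier S whose parent is P, and P depending on C, `toy_reaches(&C, &S)` finds the cycle through the parent even though S itself has no outgoing supplier edges.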
Reported-by: Tomi Valkeinen <tomi.valkeinen@xxxxxxxxxxxxxxxx>
Closes: https://lore.kernel.org/all/1a1ab663-d068-40fb-8c94-f0715403d276@xxxxxxxxxxxxxxxx/
Fixes: 6442d79d880c ("driver core: fw_devlink: Improve detection of overlapping cycles")
Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
---
Greg,
I've tested this on my end and it looks ok and nothing fishy is going
on. You can pick this up once Tomi gives a Tested-by.
I tested this on a TI AM62 SK board. It has an LVDS (OLDI) display and an
HDMI output, and both displays are connected to the same display
subsystem. I tested with OLDI single and dual link cases, with and
without HDMI, and in all cases probing works fine.
Looks good on that front, so:
Tested-by: Tomi Valkeinen <tomi.valkeinen@xxxxxxxxxxxxxxxx>
You also asked for a diff of the devlinks. That part doesn't look so
good to me, but probably you can tell if it's normal or not.
$ diff devlink-single-broken.txt devlink-single-fixed.txt
2d1
< i2c:1-0022--i2c:1-003b
11d9
< platform:44043000.system-controller:clock-controller--platform:20010000.i2c
27d24
< platform:44043000.system-controller:clock-controller--platform:601000.gpio
42d38
< platform:44043000.system-controller:power-controller--platform:20010000.i2c
58d53
< platform:44043000.system-controller:power-controller--platform:601000.gpio
74d68
< platform:4d000000.mailbox--platform:44043000.system-controller
76d69
< platform:601000.gpio--i2c:1-0022
80d72
< platform:bus@f0000:interrupt-controller@a00000--platform:601000.gpio
82d73
< platform:f4000.pinctrl--i2c:1-0022
84d74
< platform:f4000.pinctrl--platform:20010000.i2c
"i2c:1-003b" is the hdmi bridge, "i2c:1-0022" is a gpio expander. So,
for example, we lose the devlink between the gpio expander and the hdmi
bridge. The expander is used for interrupts. There's an interrupt line
from the HDMI bridge to the expander, and from there there's an
interrupt line going to the SoC.
Also, I noticed the devlinks change if I load the display drivers. The
above is before loading. Comparing the loaded/not-loaded:
$ diff devlink-dual-fixed.txt devlink-dual-fixed-loaded.txt
3d2
< i2c:1-003b--platform:30200000.dss
23d21
< platform:44043000.system-controller:clock-controller--platform:30200000.dss
52d49
< platform:44043000.system-controller:power-controller--platform:30200000.dss
73d69
< platform:display--platform:30200000.dss
78d73
< platform:f4000.pinctrl--platform:30200000.dss
97a93
> regulator:regulator.0--platform:display
Tomi
Thanks,
Saravana
v1 -> v2:
- Removed the RFC tag
- Renamed the subject. v1 is https://lore.kernel.org/all/20241025223721.184998-1-saravanak@xxxxxxxxxx/T/#u
- Added a NULL check to avoid NULL pointer deref
drivers/base/core.c | 46 ++++++++++++++++++++-------------------------
1 file changed, 20 insertions(+), 26 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 3b13fed1c3e3..f96f2e4c76b4 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1990,10 +1990,10 @@ static struct device *fwnode_get_next_parent_dev(const struct fwnode_handle *fwn
*
* Return true if one or more cycles were found. Otherwise, return false.
*/
-static bool __fw_devlink_relax_cycles(struct device *con,
+static bool __fw_devlink_relax_cycles(struct fwnode_handle *con_handle,
struct fwnode_handle *sup_handle)
{
- struct device *sup_dev = NULL, *par_dev = NULL;
+ struct device *sup_dev = NULL, *par_dev = NULL, *con_dev = NULL;
struct fwnode_link *link;
struct device_link *dev_link;
bool ret = false;
@@ -2010,22 +2010,22 @@ static bool __fw_devlink_relax_cycles(struct device *con,
sup_handle->flags |= FWNODE_FLAG_VISITED;
- sup_dev = get_dev_from_fwnode(sup_handle);
-
/* Termination condition. */
- if (sup_dev == con) {
+ if (sup_handle == con_handle) {
pr_debug("----- cycle: start -----\n");
ret = true;
goto out;
}
+ sup_dev = get_dev_from_fwnode(sup_handle);
+ con_dev = get_dev_from_fwnode(con_handle);
/*
* If sup_dev is bound to a driver and @con hasn't started binding to a
* driver, sup_dev can't be a consumer of @con. So, no need to check
* further.
*/
if (sup_dev && sup_dev->links.status == DL_DEV_DRIVER_BOUND &&
- con->links.status == DL_DEV_NO_DRIVER) {
+ con_dev && con_dev->links.status == DL_DEV_NO_DRIVER) {
ret = false;
goto out;
}
@@ -2034,7 +2034,7 @@ static bool __fw_devlink_relax_cycles(struct device *con,
if (link->flags & FWLINK_FLAG_IGNORE)
continue;
- if (__fw_devlink_relax_cycles(con, link->supplier)) {
+ if (__fw_devlink_relax_cycles(con_handle, link->supplier)) {
__fwnode_link_cycle(link);
ret = true;
}
@@ -2049,7 +2049,7 @@ static bool __fw_devlink_relax_cycles(struct device *con,
else
par_dev = fwnode_get_next_parent_dev(sup_handle);
- if (par_dev && __fw_devlink_relax_cycles(con, par_dev->fwnode)) {
+ if (par_dev && __fw_devlink_relax_cycles(con_handle, par_dev->fwnode)) {
pr_debug("%pfwf: cycle: child of %pfwf\n", sup_handle,
par_dev->fwnode);
ret = true;
@@ -2067,7 +2067,7 @@ static bool __fw_devlink_relax_cycles(struct device *con,
!(dev_link->flags & DL_FLAG_CYCLE))
continue;
- if (__fw_devlink_relax_cycles(con,
+ if (__fw_devlink_relax_cycles(con_handle,
dev_link->supplier->fwnode)) {
pr_debug("%pfwf: cycle: depends on %pfwf\n", sup_handle,
dev_link->supplier->fwnode);
@@ -2140,25 +2140,19 @@ static int fw_devlink_create_devlink(struct device *con,
return -EINVAL;
/*
- * SYNC_STATE_ONLY device links don't block probing and supports cycles.
- * So, one might expect that cycle detection isn't necessary for them.
- * However, if the device link was marked as SYNC_STATE_ONLY because
- * it's part of a cycle, then we still need to do cycle detection. This
- * is because the consumer and supplier might be part of multiple cycles
- * and we need to detect all those cycles.
+ * Don't try to optimize by not calling the cycle detection logic under
+ * certain conditions. There's always some corner case that won't get
+ * detected.
*/
- if (!device_link_flag_is_sync_state_only(flags) ||
- flags & DL_FLAG_CYCLE) {
- device_links_write_lock();
- if (__fw_devlink_relax_cycles(con, sup_handle)) {
- __fwnode_link_cycle(link);
- flags = fw_devlink_get_flags(link->flags);
- pr_debug("----- cycle: end -----\n");
- dev_info(con, "Fixed dependency cycle(s) with %pfwf\n",
- sup_handle);
- }
- device_links_write_unlock();
+ device_links_write_lock();
+ if (__fw_devlink_relax_cycles(link->consumer, sup_handle)) {
+ __fwnode_link_cycle(link);
+ flags = fw_devlink_get_flags(link->flags);
+ pr_debug("----- cycle: end -----\n");
+ pr_info("%pfwf: Fixed dependency cycle(s) with %pfwf\n",
+ link->consumer, sup_handle);
}
+ device_links_write_unlock();
if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
sup_dev = fwnode_get_next_parent_dev(sup_handle);