Hi, On Tue, 1 Mar 2022 at 11:42, Jinlong Mao <quic_jinlmao@xxxxxxxxxxx> wrote: > > On 2/28/2022 10:51 PM, Suzuki K Poulose wrote: > > Hi Jinlong > > On 28/02/2022 13:31, Mao Jinlong wrote: > > From: Mao Jinlong <jinlmao@xxxxxxxxxxxxxxxx> > > It is possible that when device probe, its child device is not > probed. Then it will fail when add sysfs connection for the device. > Make device defer probe when the child device is not probed. > > > Please could you a bit a more specific on the exact issue ? > I don't see a problem with probe deferral right now, with > coresight/next. > > For e.g., > > root@juno-server:~# lsmod > Module Size Used by > coresight 73728 0 > root@juno-server:~# ls /sys/bus/coresight/devices/ > root@juno-server:~# modprobe coresight-etm4x > root@juno-server:~# lsmod > Module Size Used by > coresight_etm4x 81920 0 > coresight 73728 1 coresight_etm4x > root@juno-server:~# ls /sys/bus/coresight/devices/ > etm0 etm1 > > -- Note etm2-etm5 doesn't appear -- > > root@juno-server:~# modprobe coresight-funnel > root@juno-server:~# lsmod > Module Size Used by > coresight_funnel 20480 0 > coresight_etm4x 81920 0 > coresight 73728 2 coresight_etm4x,coresight_funnel > root@juno-server:~# ls /sys/bus/coresight/devices/ > etm0 etm1 > > -- Still don't appear --- > > root@juno-server:~# modprobe coresight-replicator > root@juno-server:~# ls /sys/bus/coresight/devices/ > etm0 etm1 > root@juno-server:~# modprobe coresight-tmc > > -- At this stage, the devices automatically get probed and appear -- > root@juno-server:~# ls /sys/bus/coresight/devices/ > etm0 etm1 etm2 etm3 etm4 etm5 funnel0 funnel1 funnel2 tmc_etf0 tmc_etr0 > > > root@juno-server:~# lsmod > Module Size Used by > coresight_tmc 40960 0 > coresight_replicator 20480 0 > coresight_funnel 20480 0 > coresight_etm4x 81920 0 > coresight 73728 4 coresight_tmc,coresight_etm4x,coresight_replicator,coresight_funnel > > So, my question is, what is this patch trying to solve ? > > > Cheers > Suzuki > > Hi Suzuki, > > This issue happens when race condition happens. > The condition is that the device and its child_device's probe happens at the same time. > > For example: device0 and its child device device1. > Both of them are calling coresight_register function. device0 is calling coresight_fixup_device_conns. > device1 is waiting for device0 to release the coresight_mutex. Because device1's csdev node is allocated, > coresight_make_links will be called for device0. Then in coresight_add_sysfs_link, has_conns_grp is true > for device0, but has_conns_grp is false for device1 as has_conns_grp is set to true in coresight_create_conns_sysfs_group . > The probe of device0 will fail for at this condition. > > > struct coresight_device *coresight_register(struct coresight_desc *desc) > { > ......... > mutex_lock(&coresight_mutex); > > ret = coresight_create_conns_sysfs_group(csdev); > if (!ret) > ret = coresight_fixup_device_conns(csdev); > if (!ret) > ret = coresight_fixup_orphan_conns(csdev); > if (!ret && cti_assoc_ops && cti_assoc_ops->add) > cti_assoc_ops->add(csdev); > > mutex_unlock(&coresight_mutex); > > ......... > > } > > static int coresight_fixup_device_conns(struct coresight_device *csdev) > { > .......... > conn->child_dev = > coresight_find_csdev_by_fwnode(conn->child_fwnode); The issue appears to be a constraint hidden in the lower layers of the code. Would a better solution not be to alter the code here: if (conn->child_dev && conn->child_dev->has_conns_grp) { ... } else { csdev->orphan = true; } which would mean that the connection attempt would drop through to label the connection as an orphan, to be cleaned up by the child itself when it runs coresight_fixup_orphan_conns() Regards Mike > if (conn->child_dev) { > ret = coresight_make_links(csdev, conn, > > conn->child_dev); > > .......... > > } > > > int coresight_add_sysfs_link(struct coresight_sysfs_link *info) > { > ................ > if (!info->orig->has_conns_grp || !info->target->has_conns_grp) > return -EINVAL; > > > > The probe fail issue is reproduced with reboot stress test on our internal device. > > With the patch, the probe fail issue is not reproduced. > > Thanks > > Jinlong Mao > > > > Signed-off-by: Mao Jinlong <jinlmao@xxxxxxxxxxxxxxxx> > --- > drivers/hwtracing/coresight/coresight-sysfs.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c > index 34d2a2d31d00..7df9eb59bf2c 100644 > --- a/drivers/hwtracing/coresight/coresight-sysfs.c > +++ b/drivers/hwtracing/coresight/coresight-sysfs.c > @@ -73,8 +73,10 @@ int coresight_add_sysfs_link(struct coresight_sysfs_link *info) > if (!info->orig || !info->target || > !info->orig_name || !info->target_name) > return -EINVAL; > - if (!info->orig->has_conns_grp || !info->target->has_conns_grp) > + if (!info->orig->has_conns_grp) > return -EINVAL; > + if (!info->target->has_conns_grp) > + return -EPROBE_DEFER; > /* first link orig->target */ > ret = sysfs_add_link_to_group(&info->orig->dev.kobj, > > -- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK