Hi Bryan,
What happens to this code if you
static int count;
1. sleep in dwc3_probe for 10 milliseconds
2. return -EPROBE_DEFER
3. if count++ < 5 goto 1
i.e. if we simulate say waiting on a PHY driver to probe in
dwc3_probe()
The vendor hooks are used in __dwc3_set_mode and role_switch_set
calls in core and drd files respectively. These are invoked only if
we are OTG capable. The drd_work is initialized in core_init_mode
which is called at the end of dwc3_probe. If dwc3_probe fails and
gets deferred before that, none of the vendor hooks will be fired and
dwc3_qcom_probe is also deferred.
However I see that if core_init_mode fails (the cleanup is already
done in drd to prevent set_role from getting invoked already), I
need to cleanup vendor hooks in error path of dwc3_probe().
and what happens if we introduce a 100 millsecond sleep into
dwc3_qcom_probe() - and run a fake disconnect event from
dwc3_qcom_probe_core() directly ?
In other words if make it that dwc3_probe() completes and struct
dwc3_glue_ops->notify_cable_disconnect() fires prior to
dwc3_qcom_probe_core() completing ?
i.e. I don't immediately see how you've solved the probe()
completion race condition here.
Just wanted to understand the situation clearly. Is this the sequence
you are referring to ?
1. dwc3_probe is successful and role switch is registered properly.
2. added delay after dwc3_qcom_probe_core and before interconnect_init
3. Between this delay, we got a disconnect notificiation from glink
4. We are clearing the qscratch reg in case of device mode and
un-registering notifier in case of host mode.
If so, firstly I don't see any issue if we process disconnect event
before qcom probe is complete. If we reached this stage, the
clocks/gdsc is definitely ON and register accesses are good to go.
If we are in host mode at this point, we would just unregister to
usb-core notifier and mark last busy. If we are in device mode, we
would just clear the hs_phy_ctrl reg of qscratch. After the 100ms
delay you mentioned we would call dwc3_remove anyways and cleanup the
vendor hooks. But is the concern here that, what if we enter
runtime_suspend at this point ?
Just to clarify one more thing. The probe completion requirement came
in because, before the device tree was flattened, dwc3-qcom and core
are two different platform devices. And if the dwc3 core device probe
got deferred, dwc3-qcom probe still gets successfully completed. The
glue would never know when to register vendor hook callbacks to
dwc3-core as it would never know when the core probe was completed.
That is the reason we wanted to find out accurate point where core
probe is done to ensure we can properly register these callbacks.
Are you saying to you require/rely on both of these series being applied
first ?
[1]:
https://lore.kernel.org/all/af60c05b-4a0f-51b8-486a-1fc601602515@xxxxxxxxxxx/
[2]:
https://lore.kernel.org/all/20231016-dwc3-refactor-v1-0-ab4a84165470@xxxxxxxxxxx/
Must be, nothing applies for me in this series.
The first one is not a patch. It is just a discussion thread I started
to get community's opinion before on disconnect interrupt handling. The
current series is based on top of [2] made by Bjorn (as you already
found out) and as I mentioned in cover letter of my series.
Regards,
Krishna,