On Wed 07 Jul 22:06 CDT 2021, Peter Chen wrote: > On 21-07-07 14:03:19, Bjorn Andersson wrote: > > On Tue 06 Jul 20:57 CDT 2021, Peter Chen wrote: > > > > Allow me to reorder your two questions: > > > > > And why using a notifier need to concern core's deferral probe? > > > > The problem at hand calls for the core for somehow invoking > > dwc3_qcom_vbus_overrride_enable() with a pointer to dwc3_qcom passed. > > > > This means that dwc3-qcom somehow needs to inform the dwc3-core about > > this (and stash the pointer). And this can't be done until dwc3-core > > actually exist, which it won't until dwc3_probe() has completed > > successfully (or in particular allocated struct dwc). > > Maybe you misunderstood the notifier I meant previous, my pointer was > calling glue layer API directly. > > Role switch is from dwc3-core, when it occurs, it means structure dwc3 has > allocated successfully, you could call glue layer notifier at function > dwc3_usb_role_switch_set directly. > Some references of my idea [1] [2] > It's probably 5+ years since I ran into something using platform_data, had totally forgotten about it. Defining a dwc3_platdata to allow the glue drivers to pass a function pointer (and Wesley's bool) to the core driver sounds like a possible way out of this. > [1] Function ci_hdrc_msm_notify_event at ci_hdrc_msm_notify_event > [2] https://source.codeaurora.org/external/imx/linux-imx/tree/drivers/usb/dwc3/core.c?h=lf-5.10.y#n205 > > > > > > Why do you think we need to retry the parent's probe again? > > > > There's four options here: > > > > 0) Hope that of_platform_populate() always succeeds in invoking > > dwc3_probe() on the first attempt, so that it is available when > > of_find_device_by_node() is invoked in dwc3_qcom_probe() (and a few of > > the other platform's drivers). > > > > 1) Ensure that the operations performed by dwc3_probe() happens > > synchronously and return a failure to dwc3-qcom, which depending on how > > dwc3_probe() failed can propagate that failure - i.e. either probe defer > > or clean up its resources if the failure from dwc3-core is permanent. > > > > 2) Register the dwc3-core and then return from dwc3_qcom_probe() in some > > half-initialized state and through some yet to be invented notification > > mechanism continue the tail of dwc3_qcom_probe() once dwc3_probe() has > > finished. But I know of no such notification mechanism in place today > > and we can just register a callback, because of 1). > > Furthermore we'd leave dwc3-qcom half-initialized until the dwc3-core is > > done probing - which might never happen. > > > > 3) Make drvdata of all the platform drivers be some known struct that > > dwc3-core can retrieve and dereference - containing the optional > > callback to the role_switch callback. > > > > > > We've tried the option 0) for a few years now. Option 2) is a variant of > > what we have today, where we patch one problem up and hope that nothing > > unexpected happens until things has fully probed. We're doing 3) in > > various other places, but in my view it's abusing a void * and has to be > > kept synchronized between all the possible parent drivers. > > > > Left is 1), which will take some refactoring, but will leave the > > interaction between the two drivers in a state that's very easy to > > reason about. > > Function of_find_device_by_node() invoked at glue layer is usually successfully, Went spelunking in drivers/base again, and I think you're right. of_find_device_by_node() looks for devices on the platform_bus's klist of devices, so if of_platform_populate() ends up successfully getting through device_add() the we will find something. It might not have probed yet, but as long as we don't rely on that we should be good... > The dwc3_probe failure doesn't affect it, unless you enable auto-suspend, > and glue layer's runtime suspend routine depends on dwc3 core's runtime suspend > routine. Right, if we hit qcom_dwc3_resume_irq() before the core driver has probed it certainly looks like we're going to hit a NULL pointer. > Would you please describe more about dwc3-core probe failure causes > dwc3-qcom's probe has failed or in half-initialized state you said? > Bryan had a previous patch where the glue layer was notified about role switching (iirc) and as soon as we hit a probe deferal in the core driver we'd dereference some pointer in the glue layer. I don't find the patch right now, but I suspect it might have been caused by the same platform_get_drvdata() as we see in qcom_dwc3_resume_irq(). > > > > > I know there are some downstream code which using this way, I would > > > like to know the shortcoming for it. > > > > > > > The shortcoming of having dwc3_qcom_probe() invoke dwc3_probe() > > "manually" and then returning -EPROBE_DEFER if the dwc3-core's resources > > aren't yet available is that we're wasting some time tearing down the > > dwc3-qcom state and then re-initialize it next time this is attempted. > > Like above, would you explain more about it? > I could, but I guess if we use platform_data to pass the callbacks to the core it doesn't matter if the core driver probes synchronously or in a week (except if the glue hits qcom_dwc3_resume_irq(), but if that can happen it can be fixed separately)... Regards, Bjorn