On Fri, May 22, 2020 at 01:48:00PM -0500, Pierre-Louis Bossart wrote:
>
>
> On 5/22/20 1:40 PM, Jason Gunthorpe wrote:
> > On Fri, May 22, 2020 at 01:35:54PM -0500, Pierre-Louis Bossart wrote:
> > >
> > >
> > > On 5/22/20 12:10 PM, Jason Gunthorpe wrote:
> > > > On Fri, May 22, 2020 at 10:33:20AM -0500, Pierre-Louis Bossart wrote:
> > > >
> > > > > > Maybe not great, but at least it is consistent with all the lifetime
> > > > > > models and the operation of the driver core.
> > > > >
> > > > > I agree your comments are valid ones, I just don't have a solution to be
> > > > > fully compliant with these models and report failures of the driver probe
> > > > > for a child device due to configuration issues (bad audio topology, etc).
> > > >
> > > >
> > > > > My understanding is that errors on probe are explicitly not handled in the
> > > > > driver core, see e.g. comments such as:
> > > >
> > > > Yes, but that doesn't really apply here...
> > > > > /*
> > > > >  * Ignore errors returned by ->probe so that the next driver can try
> > > > >  * its luck.
> > > > >  */
> > > > > https://elixir.bootlin.com/linux/latest/source/drivers/base/dd.c#L636
> > > > >
> > > > > If somehow we could request the error to be reported then probably we
> > > > > wouldn't need this complete/wait_for_completion mechanism as a custom
> > > > > notification.
> > > >
> > > > That is the same issue as the completion, a driver should not be
> > > > making assumptions about ordering like this. For instance what if the
> > > > current driver is in the initrd and the 2nd driver is in a module in
> > > > the filesystem? It will not probe until the system boots more
> > > > completely.
> > > >
> > > > This is all stuff that is supposed to work properly.
> > > >
> > > > > Not at the moment, no. there are no failures reported in dmesg, and
> > > > > the user does not see any card created. This is a silent error.
> > > >
> > > > Creating a partial non-function card until all the parts are loaded
> > > > seems like the right way to surface an error like this.
> > > >
> > > > Or don't break the driver up in this manner if all the parts are really
> > > > required just for it to function - quite strange place to get into.
> > >
> > > This is not about having all the parts available - that's handled already
> > > with deferred probe - but an error happening during card registration. In
> > > that case the ALSA/ASoC core throws an error and we cannot report it back to
> > > the parent.
> >
> > The whole point of the virtual bus stuff was to split up a
> > multi-functional PCI device into parts. If all the parts are required
> > to be working to make the device work, why are you using virtual bus
> > here?
>
> It's the other way around: how does the core know that one part isn't
> functional.
> There is nothing in what we said that requires that all parts are fully
> functional. All we stated is that when *one* part isn't fully functional we
> know about it.

Maybe if you can present some diagram or something, because I really
can't understand what ASoC is trying to do with virtual bus here.

> > > > What happens if the user unplugs this sub driver once things start
> > > > running?
> > >
> > > refcounting in the ALSA core prevents that from happening usually.
> >
> > So user triggered unplug of driver that attaches here just hangs
> > forever? That isn't OK either.
>
> No, you'd get a 'module in use' error if I am not mistaken.

You can disconnect drivers without unloading modules. Believing
otherwise is a common misconception. You should never, ever, rely on
module ref counting for anything more than keeping function pointers
in memory.

Jason
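
To illustrate the last point above: the driver core exposes an "unbind"
attribute under /sys/bus/<bus>/drivers/<driver>/, and writing a device
name to it detaches that driver from the device while the module stays
loaded. The following is only a minimal sketch, assuming root privileges
on a real system. The function name unbind_device, the sysfs_root
parameter, and any bus, driver, or device names passed to it are
hypothetical and do not come from this thread.

```python
from pathlib import Path


def unbind_device(bus: str, driver: str, device: str,
                  sysfs_root: str = "/sys") -> Path:
    """Ask the driver core to detach `driver` from `device`.

    This writes the bare device name to the driver's sysfs "unbind"
    attribute. The module providing the driver is not unloaded, so its
    module reference count plays no part in whether the unbind happens.
    `sysfs_root` is a hypothetical parameter added so the sketch can be
    exercised against a scratch directory instead of the real /sys.
    """
    unbind = Path(sysfs_root) / "bus" / bus / "drivers" / driver / "unbind"
    # The attribute expects the device name as it appears on the bus,
    # for example a PCI address such as "0000:00:1f.3".
    unbind.write_text(device)
    return unbind
```

Called with a real bus, driver, and device name, this triggers the
driver's remove path even though the module itself would still be
listed as loaded, which is why a driver has to cope with unbind at any
time rather than lean on module reference counts.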