On Tue, 26 May 2020 15:15:26 +0200,
Pierre-Louis Bossart wrote:
> 
> On 5/24/20 1:35 AM, Greg KH wrote:
> > On Sat, May 23, 2020 at 02:41:51PM -0500, Pierre-Louis Bossart wrote:
> >>
> >> On 5/23/20 1:23 AM, Greg KH wrote:
> >>> On Fri, May 22, 2020 at 09:29:57AM -0500, Pierre-Louis Bossart wrote:
> >>>> This is not an hypothetical case, we've had this recurring problem when a
> >>>> PCI device creates an audio card represented as a platform device. When the
> >>>> card registration fails, typically due to configuration issues, the PCI
> >>>> probe still completes.
> >>>
> >>> Then fix that problem there. The audio card should not be being created
> >>> as a platform device, as that is not what it is. And even if it was,
> >>> the probe should not complete, it should clean up after itself and error
> >>> out.
> >>
> >> Did you mean 'the PCI probe should not complete and error out'?
> >
> > Yes.
> >
> >> If yes, that's yet another problem... During the PCI probe, we start a
> >> workqueue and return success to avoid blocking everything.
> >
> > That's crazy.
> >
> >> And only 'later' do we actually create the card. So that's two levels
> >> of probe that cannot report a failure. I didn't come up with this
> >> design, IIRC this is due to audio-DRM dependencies and it's been used
> >> for 10+ years.
> >
> > Then if the probe function fails, it needs to unwind everything itself
> > and unregister the device with the PCI subsystem so that things work
> > properly. If it does not do that today, that's a bug.
> >
> > What kind of crazy dependencies cause this type of "requirement"?
> 
> I think it is related to the request_module("i915") in
> snd_hdac_i915_init(), and possibly other firmware download.
> 
> Adding Takashi for more details.

Right, there are a few levels of complexity there. The HD-audio
PCI controller driver, for example, is initialized in an async way
with a work.
It loads the firmware files with request_firmware_nowait() and also
binds itself as a component master with the DRM graphics driver via
the component framework.

Currently it has no way to unwind the PCI binding itself in the error
path, though. In theory it should be possible to unregister the PCI
binding from within the driver itself in the work context, but that
failed in earlier experiments, hence the driver puts itself into a
disabled state instead. It may be worth trying again.

Note, however, that none of the sub-devices belonging to it are left
instantiated; they are all deleted in the error path. Only the main
PCI binding is kept in a disabled state, just as a placeholder, until
it's unbound explicitly.


Takashi
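
To make the behavior described above concrete, here is a minimal
userspace sketch of the two-stage pattern: the probe returns success
right away, the deferred work does the real setup, and on failure the
sub-devices are deleted while the main binding stays around in a
disabled state until it is removed explicitly. This is only a model
under stated assumptions, not the actual HD-audio code: struct ctrl,
ctrl_probe(), ctrl_probe_work(), ctrl_remove() and the firmware_ok /
component_ok flags are all hypothetical names made up for illustration,
and no kernel API is used.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum ctrl_state { CTRL_UNBOUND, CTRL_PROBING, CTRL_READY, CTRL_DISABLED };

#define MAX_SUBDEVS 4

struct ctrl {
	enum ctrl_state state;
	int num_subdevs;	/* sub-devices currently instantiated */
	bool firmware_ok;	/* simulated result of the async firmware load */
	bool component_ok;	/* simulated result of the component bind */
};

/* Stage 1: returns success immediately; the real setup is deferred. */
int ctrl_probe(struct ctrl *c)
{
	c->state = CTRL_PROBING;
	c->num_subdevs = 0;
	return 0;
}

/* Stage 2: deferred work. It has no caller to report a failure to. */
void ctrl_probe_work(struct ctrl *c, int want_subdevs)
{
	int i;

	if (c->state != CTRL_PROBING)
		return;
	if (want_subdevs > MAX_SUBDEVS)
		want_subdevs = MAX_SUBDEVS;

	if (!c->firmware_ok)
		goto fail;

	for (i = 0; i < want_subdevs; i++)
		c->num_subdevs++;	/* stands in for creating a sub-device */

	if (!c->component_ok)
		goto fail;

	c->state = CTRL_READY;
	return;

fail:
	/* Delete all sub-devices, but keep the main binding as a placeholder. */
	c->num_subdevs = 0;
	c->state = CTRL_DISABLED;
}

/* Explicit unbind: the only way out of the disabled state in this model. */
void ctrl_remove(struct ctrl *c)
{
	c->num_subdevs = 0;
	c->state = CTRL_UNBOUND;
}
```

With component_ok set to false, ctrl_probe() still returns 0, and after
ctrl_probe_work() the controller sits in CTRL_DISABLED with no
sub-devices, which is the "two levels of probe that cannot report a
failure" situation from the thread.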