Appreciate you investing the effort in helping on this.
I will start merging it now, since it doesn't apply cleanly on my branch.
If I understand correctly, your main HW access prevention mechanism during
the PCI prepare-rescan period is to bail out of IOCTLs via the check of
power state == SNDRV_CTL_POWER_D0, or to wait until a user process closes
its device file descriptor, in patches 2 and 5. For command submission
prevention you use the freeze flag from patch 6.
Unless I have missed something, I don't see how these protect when a
new device is plugged in while any of those operations are already in
flight. What prevents concurrent HW access from an IOCTL that is already
running while the HW suspend and MMIO unmapping in rescan_prepare start
after the IOCTL began?
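
To make the concern concrete, here is a minimal userspace sketch of one
way such a window could be closed: count the IOCTLs in flight and have
the prepare step wait for that count to drain after setting the flag.
This is only a model under my own assumptions, not code from the patches;
struct fake_dev and all fake_dev_* names are hypothetical and are not
ALSA or PCI core APIs.

```c
/* Userspace model only; none of these names exist in the kernel. */
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

struct fake_dev {
	atomic_bool frozen;	/* set once rescan-prepare has started */
	atomic_int inflight;	/* IOCTLs currently touching the "HW" */
};

void fake_dev_init(struct fake_dev *d)
{
	atomic_init(&d->frozen, false);
	atomic_init(&d->inflight, 0);
}

/* IOCTL entry: returns false if the device is being frozen. */
bool fake_dev_enter(struct fake_dev *d)
{
	/*
	 * Publish ourselves first, then check the flag. With the
	 * opposite order, a freeze could slip in between the check
	 * and the increment and miss this caller entirely.
	 */
	atomic_fetch_add(&d->inflight, 1);
	if (atomic_load(&d->frozen)) {
		atomic_fetch_sub(&d->inflight, 1);
		return false;
	}
	return true;
}

/* IOCTL exit: must pair with a successful fake_dev_enter(). */
void fake_dev_exit(struct fake_dev *d)
{
	int prev = atomic_fetch_sub(&d->inflight, 1);

	assert(prev > 0);
}

/* Prepare step: block new callers, then wait for running ones. */
void fake_dev_freeze_and_drain(struct fake_dev *d)
{
	atomic_store(&d->frozen, true);
	while (atomic_load(&d->inflight) > 0)
		sched_yield();
	/* Only here would it be safe to suspend HW and unmap MMIO. */
}
```

A check of the power state alone corresponds to fake_dev_enter() without
the counter; the drain loop is the part I do not see an equivalent for.
A real implementation would presumably sleep on a wait queue rather than
spin.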
Andrey
On 2021-03-24 6:00 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 19:25:53 +0100,
Andrey Grodzovsky wrote:
This will cover IOCTLs and any
mmapped accesses, I guess. Interrupts we discussed above. What about any
possible background kernel work going on in dedicated threads or work
items? Any pointers there on what should be blocked and waited for?
An alternative idea would be the analogy of the system suspend /
resume. That is, we forcibly suspend the devices at first somehow,
and also restrict further accesses in some way. Then do remap,
But that's the point, I guess: how do you block further accesses without
those big locks? During S3, I believe user mode gets suspended before the
driver, so you don't need to worry about concurrent IOCTLs when going
through the suspend sequence.
ALSA core still has some legacy card-level power management code,
which was introduced many years ago at the time we still managed the
power state via an extra ioctl (hence working independently of the
base PM code), and a few pieces are still effective for this kind of
purpose. At a quick glance, a couple of places need band-aids,
but the rest should work.
A bit more difficult problem is the floating control API calls. The
get/put calls might still be in flight when we perform the PCI
rescan. These have to be filtered out additionally.
Below is a patch series I cooked up quickly. Totally untested, just
checked the compilation. The first patch is a fix I'll merge in
anyway, while the rest are RFC.
thanks,
Takashi