I merged the patches, since it looks like my branch
is older I didn't have chip->init_failed member and
also seems there was a typo on ur side as there
is no flush_work_sync, only flush_work which waits
synchronously anyway.
Everything works fine when testing with maplyer running concurrently
to PCI rescan cycle.
Can you be more specific what are those get/put calls,
I am thinking about some waitqueue to wait on in rescan_prepare
after setting snd_hdac_bus_freeze,
on wakeup it checks that a counter dropped back to zero.
Not sure on which entity to hang this counter ?
Andrey
On 2021-03-24 11:06 a.m., Takashi Iwai wrote:
On Wed, 24 Mar 2021 15:53:33 +0100,
Andrey Grodzovsky wrote:
Appreciate you investing the effort in helping on this.
I will start to merge it now as it doesn't apply cleanly on my branch.
If I understand correctly your main HW access prevention mechanism during
the PCI prepare-rescan period is by bailing out on IOCTLs with the check
of power state == SNDRV_CTL_POWER_D0 or waiting when a user process closes
it's device file descriptor in patches 2 and 5. For command submission
prevention you use the freeze flag from patch 6.
If I haven't missed anything I don't see how those all protect when
new device is plugged while any of those operations are already in
flight. What prevents concurrent HW access from an IOCTL already
running
and HW suspend and MMIO unampping in rescan_preapre which starts after
IOCTL began ?
The call of snd_pcm_suspend_all() is the key. That puts all PCM
streams in the stopped and suspended state. And the codec devices as
well as the controller device will be put into the suspended state.
So, for the PCM side, this should be fine with it, I guess.
However, after sending the patches, I noticed that they won't suffice
for the pending control calls. For the control get/put callbacks that
have been pending before setting the power_state=D3hot, they would
still kick off the runtime PM and bad things may happen. Due to that,
the bus.frozen flag won't work reliably, I'm afraid.
So, in that part, we need the code to sync the execution of the
pending get/put calls in addition. Maybe refcounting the control
ioctls that are involved with get/put calls
(SNDRV_CTL_IOCTL_ELEM_READ, _WRITE, maybe with _TLV_READ, too) and
wait for those ioctls finishing before actually starting the codec
suspend procedure.
Takashi