On Thu, 18 Jun 2020, Mark Hills wrote: > On Thu, 18 Jun 2020, Takashi Iwai wrote: > > > On Thu, 18 Jun 2020 13:07:33 +0200, > > Mark Hills wrote: > > > > > > On Thu, 18 Jun 2020, Takashi Iwai wrote: > > > > > > > On Wed, 17 Jun 2020 12:51:05 +0200, > > > > Mark Hills wrote: > > > [...] > > > But could I please ask for help with the bigger picture? As it feels > > > mismatched. > > > > > > * Code should take every possible opportunity to update the stream > > > position; interrupts, or explicit pcm_pointer calls (whereas the docs > > > guide towards doing it in the interrupt handler) > > > > > > * I critiqued (elsewhere in thread) the older interrupt handler for > > > checking the counter, unlocking, calling back into PCM core and checking > > > again a moment later. Whereas this is considered good behaviour. > > > > > > * Seems like the overall aim is for userland to be able (if it wants to) > > > to poll the soundcard, even outside of periods. > > > > Right, the user-space can query the current position at any time, and > > the driver should return the position as precisely as possible. > > > > Some applications (like PulseAudio) sets the interrupt as minimum as > > possible while it does schedule the update by itself, judging the > > position via the ioctl. In such operational mode, the accuracy of the > > current position query is vital. > > > > > If all the above is true, I would expect interrupt handling to be very > > > simple -- it would straight away call into PCM core, join existing if the > > > codepaths (to take locks) and do a position update. PCM core would decide > > > if a period really elapsed, not the driver. But this is not how it works. > > > > > > This now relates strongly to a question of locking: > > > > > > I ran the code (top of this message) all day, with a few instances in > > > dmesg (at irregular intervals, not wrapping): > > > > > > [161644.076666] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > [163232.392501] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > [164976.098069] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > [165054.946225] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > [165312.141545] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > > > > A definite bug, of course. > > > > > > However (and I am happy to be corrected) the function never finishes with > > > position == buffer size. Only way is a race between interrupt handler and > > > pcm_pointer. > > > > > > Therefore one of these is needed: > > > > > > * pcm_pointer locks chip->lock > > > > > > Even though the docs emphasise PCM core has exclusivity, it it not worth > > > much as it does not protect against the interrupt handler. > > > > > > But now interrupt handler becomes ugly in total: take chip->lock, check > > > the counter, release chip->lock, go to PCM core (which takes middle > > > layer locks), take chip->lock again, check counter again, release > > > chip->lock again. > > > > Yes, that's the most robust way to go. If the lock is really costing > > too much, you can consider a fast-path by some flag for the irq -> > > snd_pcm_period_elapsed() case. > > I don't understand how a fast path could be made to work, as it can't pass > data across snd_pcm_period_elapsed() and it still must syncronise access > between reading dma_counter and writing pipe->position. > > Hence questioning if a better design is simpler interrupt handlers that > just enter PCM core. > > > Basically you can do everything in the pointer callback. The only > > requirement in the interrupt handle is basically judge whether you need > > the call of snd_pcm_period_elapsed() and call it. The rest update could > > be done in the other places. > > Thanks, please just another clarification: > > I presume that calls to pcm_pointer are completely independent of the > period notifications? > > ie. period notifications are at regular intervals, regardless of whether > pcm_pointer is called inbetween. pcm_pointer must not reset any state used > by the interrupt. > > Which means we must handle when non-interrupt call to pcm_pointer causes a > period to elapse. The next interrupt handler must notify. > > I can see in the original code uses chip->last_period exclusively by the > interrupt handler, and I removed it. Some comments around the intent would > help. I'll cross reference the original code with my new understanding. > > My instinct here is that to preserve > > - regular period notifications > > - handle period_size not aligning with 32-bit counter > > - no races between interrupt and pcm_pointer > > that the clearest and bug-free implementation may be to separate the > interrupt (notifications) and pointer updates to separate state. > > Then there's no lock and the only crossover is an atomic read of > dma_counter. > > And that's what I will try -- thanks. Ok so the implementation would look something like this below, which I will run for the rest of the day: * Clear separation of the period notification from position updates; only syncronising is around dma_counter, no need for locks * All counting is accumulated to avoid bugs in the cases of wrapping and non-alignment It's easier to see in the end results but of course I'll work on a clear diff. --- /****************************************************************************** IRQ Handling ******************************************************************************/ /* Check if a period has elapsed since last interrupt * * Don't make any updates to state; PCM core handles this with the * correct locks. * * \return true if a period has elapsed, otherwise false */ static bool period_has_elapsed(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct audiopipe *pipe = runtime->private_data; u32 counter, step; size_t period_bytes; if (pipe->state != PIPE_STATE_STARTED) return false; period_bytes = frames_to_bytes(runtime, runtime->period_size); counter = le32_to_cpu(*pipe->dma_counter); /* presumed atomic */ step = counter - pipe->last_period; /* handles wrapping */ step -= step % period_bytes; /* acknowledge whole periods only */ if (step == 0) return false; /* haven't advanced a whole period yet */ pipe->last_period += step; /* used exclusively by us */ return true; } static irqreturn_t snd_echo_interrupt(int irq, void *dev_id) { struct echoaudio *chip = dev_id; int ss, st; spin_lock(&chip->lock); st = service_irq(chip); if (st < 0) { spin_unlock(&chip->lock); return IRQ_NONE; } /* The hardware doesn't tell us which substream caused the irq, thus we have to check all running substreams. */ for (ss = 0; ss < DSP_MAXPIPES; ss++) { struct snd_pcm_substream *substream; substream = chip->substream[ss]; if (substream && period_has_elapsed(substream)) { spin_unlock(&chip->lock); snd_pcm_period_elapsed(substream); spin_lock(&chip->lock); } } spin_unlock(&chip->lock); #ifdef ECHOCARD_HAS_MIDI if (st > 0 && chip->midi_in) { snd_rawmidi_receive(chip->midi_in, chip->midi_buffer, st); dev_dbg(chip->card->dev, "rawmidi_iread=%d\n", st); } #endif return IRQ_HANDLED; } static snd_pcm_uframes_t pcm_pointer(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct audiopipe *pipe = runtime->private_data; u32 counter, step; /* * IRQ handling runs concurrently. Do not share tracking of * counter with it, which would race or require locking */ counter = le32_to_cpu(*pipe->dma_counter); /* presumed atomic */ step = counter - pipe->last_counter; /* handles wrapping */ pipe->last_counter = counter; /* counter doesn't neccessarily wrap on a multiple of * buffer_size, so can't derive the position; must * accumulate */ pipe->position += step; pipe->position %= frames_to_bytes(runtime, runtime->buffer_size); /* wrap */ return bytes_to_frames(runtime, pipe->position); } -- Mark