At Tue, 11 May 2010 12:23:19 +0200 (CEST), Jaroslav Kysela wrote: > > On Tue, 11 May 2010, Takashi Iwai wrote: > > > At Tue, 11 May 2010 11:07:05 +0200 (CEST), > > Jaroslav Kysela wrote: > >> > >> On Tue, 11 May 2010, Takashi Iwai wrote: > >> > >>> At Tue, 11 May 2010 10:44:17 +0200 (CEST), > >>> Jaroslav Kysela wrote: > >>>> > >>>> On Tue, 11 May 2010, Takashi Iwai wrote: > >>>> > >>>>> At Tue, 11 May 2010 10:28:50 +0200 (CEST), > >>>>> Jaroslav Kysela wrote: > >>>>>> > >>>>>> On Mon, 10 May 2010, Jaroslav Kysela wrote: > >>>>>> > >>>>>>> Hi, > >>>>>>> > >>>>>>> I would like to ask HDA gurus to check this patch (I will include it > >>>>>>> to my tree once acked). The patch uses WALLCLK from HDA chips (marked as > >>>>>>> required in the HDA specification) to check for bogus - too early - interrups > >>>>>>> which confuses the upper PCM layer (sound skipping issues). > >>>>>>> > >>>>>>> More details about patch testing on problematic hardware can be found > >>>>>>> at: > >>>>>>> > >>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=15912 > >>>>>> > >>>>>> No objections received, so I pushed this code to my master/devel branches. > >>>>> > >>>>> Heh, that was quick, I had no time to review :) > >>>>> > >>>>> The change looks good to me. Thanks for your fix (and rebasing). > >>>>> > >>>>> One thing is that you removed bdl_pos_adj=0 check. This was there > >>>>> explicitly to avoid the delayed irq handling no matter whether the > >>>>> value is correct or not. I guess it's safer to take it back, or > >>>>> give any other way for user to avoid the delayed irq handling. > >>>> > >>>> I see. But I need the pos % period_bytes check too. What about this > >>>> change? > >>> > >>> With bdl_pos_adj==0, it should return 1 (non-error), but otherwise > >>> this should be OK, I guess. > >> > >> The problem with returning value 1 -> call snd_pcm_elapsed() - is that the > >> upper layer will be most probably confused on broken hw. From logs in > >> kernel bz#15912, the bad irq can be triggered at any time (I really don't > >> know why INTSTS register is set). > >> > >> The first wallclk check uses to check for first 2/3 of period and the > >> pos % period_bytes check checks for second half of period. Writing this, I > >> think that the extra check can be added to the second condition to do > >> position related check for first next period only: > > > > Isn't a similar check done in PCM core side? > > Filtering or delaying ack is perhaps better done here, though... > > Unfortunately, the check in PCM core expects that at least 'period_time' > is elapsed between two snd_pcm_elapsed() calls: > > if (in_interrupt) { > /* we know that one period was processed */ > /* delta = "expected next hw_ptr" for in_interrupt != 0 */ > delta = runtime->hw_ptr_interrupt + runtime->period_size; > > In situation, when the driver calls the snd_pcm_elapsed() too early, > the pointers will be updated wrongly (whole ring buffer size will be > added to hw_ptr). Yeah, that's bad. > >> diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c > >> index 0a6c55b..d4d532a 100644 > >> --- a/sound/pci/hda/hda_intel.c > >> +++ b/sound/pci/hda/hda_intel.c > >> @@ -1890,9 +1890,8 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev) > >> unsigned int pos; > >> int stream; > >> > >> - wallclk = azx_readl(chip, WALLCLK); > >> - if ((wallclk - azx_dev->start_wallclk) < > >> - (azx_dev->period_wallclk * 2) / 3) > >> + wallclk = azx_readl(chip, WALLCLK) - azx_dev->start_wallclk; > >> + if (wallclk < (azx_dev->period_wallclk * 2) / 3) > >> return -1; /* bogus (too early) interrupt */ > >> > >> stream = azx_dev->substream->stream; > >> @@ -1910,9 +1909,12 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev) > >> > >> if (WARN_ONCE(!azx_dev->period_bytes, > >> "hda-intel: zero azx_dev->period_bytes")) > >> - return 0; /* this shouldn't happen! */ > >> - if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2) > >> - return 0; /* NG - it's below the period boundary */ > >> + /* this shouldn't happen! */ > >> + return bdl_pos_adj[chip->dev_index] ? 0 : -1; > >> + if (wallclk <= azx_dev->period_wallclk && > >> + pos % azx_dev->period_bytes > azx_dev->period_bytes / 2) > >> + /* NG - it's below the first next period boundary */ > >> + return bdl_pos_adj[chip->dev_index] ? 0 : -1; > >> azx_dev->start_wallclk = wallclk; > >> return 1; /* OK, it's fine */ > >> } > >> > >> I don't see any reason to call snd_pcm_elapsed() in bad time. It's better > >> to return -1 (ignore) and wait for next interrupt. > > > > Hm, judging from the current situation, a missing IRQ is harmless than > > the wrong position ack, indeed. > > > > But delaying the period handling is a bit different from filtering > > bogus IRQs. I'm not entirely sure which is the cause in the bug > > above. If it's a bogus IRQ, could it be filtered out. If it's an IRQ > > at wrong time (too early), the delayed ack is needed. If it's the > > wrong position read, the delayed ack isn't needed but the position > > correction is needed. > > > > So, I'm a bit afraid of adding too much checks that might give a > > reaction based on wrong information. We'd need to take a detailed > > look again at each cause... > > I'm also not so much happy to have this code in the driver, but it seems > that HDA hardware does not respect the standard behaviour in some cases. > Anyway, it's good to have the code robust. > > > BTW, now I looked back again, and think > > WARN_ONCE(!azx_dev->period_bytes) should be also returning -1 for > > normal cases, no? This is just a sanity check, and if this happens, > > the chance that you get the right IRQ at the next moment isn't high, > > since something got screwed up. > > I agree. I put all suggested changes to my tree into "[ALSA] > snd-hda-intel: Improve azx_position_ok()" commit. Thanks! I pulled now this, too. Takashi _______________________________________________ Alsa-devel mailing list Alsa-devel@xxxxxxxxxxxxxxxx http://mailman.alsa-project.org/mailman/listinfo/alsa-devel