Re: [PATCHv2] wlcore: fix race for WL1271_FLAG_IRQ_RUNNING

* Kalle Valo <kvalo@xxxxxxxxxxxxxx> [191008 14:17]:
> Tony Lindgren <tony@xxxxxxxxxxx> writes:
> 
> > * Tony Lindgren <tony@xxxxxxxxxxx> [191007 17:29]:
> >> We set WL1271_FLAG_IRQ_RUNNING in the beginning of wlcore_irq(), and test
> >> for it in wlcore_runtime_resume(). But WL1271_FLAG_IRQ_RUNNING currently
> >> gets cleared too early by wlcore_irq_locked() before wlcore_irq() is done
> >> calling it. And this will race against wlcore_runtime_resume() testing it.
> >> 
> >> Let's set and clear IRQ_RUNNING in wlcore_irq() so wlcore_runtime_resume()
> >> can rely on it. And let's remove old comments about hardirq, that's no
> >> longer the case as we're using request_threaded_irq().
> >> 
> >> This fixes occasional annoying wlcore firmware reboots that start with
> >> "wlcore: WARNING ELP wakeup timeout!" followed by a multisecond latency
> >> when the wlcore firmware gets wrongly rebooted waiting for an ELP wake
> >> interrupt that won't be coming.
> >> 
> >> Note that I also suspect some form of this issue was the root cause why
> >> the wlcore GPIO interrupt has been often configured as a level interrupt
> >> instead of edge as an attempt to work around the ELP wake timeout errors.
> >
> > So this fixed a reproducible test case where loading some webpages
> > often produced ELP timeout errors. But looks like I'm still seeing ELP
> > timeouts elsewhere. So best to wait on this one. Something is still
> > wrong with the ELP timeout handling.
> 
> Ok, I'll drop this then. Please send v3 once you think the patch is
> ready to be applied.

Looks like the real fix is to use level instead of edge interrupt
for omap4 and 5 to avoid the check for untriggered interrupts in
omap_gpio_unidle(). Should not be needed for other SoCs as their
l4per can't idle independent of the CPUs.

I'll send a separate patch for that. And I'll send an updated,
clean-up-only version of the $subject patch, as the race described
above should never happen.

The clearing of the WL1271_FLAG_IRQ_RUNNING bit already happens within the
pm_runtime_get_sync() section of wlcore_irq_locked(). So this patch just
happened to slightly change the timings for my reproducible test case.
We should not be able to hit the race described above even with super
short autosuspend timeouts between wlcore_irq_locked() and the end of
wlcore_irq() :)
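
To make the ordering argument above concrete, here is a minimal userspace
sketch in C. It is only a model, not the wlcore code: struct model_dev and
all model_*() helpers are hypothetical names made up for illustration, and
the usage counter is a simplified stand-in for what pm_runtime_get_sync()
does. The assumption being modelled is that the device cannot runtime
suspend while a usage reference is held, so the flag is cleared at a point
where no suspend (and hence no later resume callback) can slip in.

```c
#include <stdbool.h>

/* Hypothetical model state; not the real wlcore or runtime PM structures. */
struct model_dev {
	int usage_count;	/* stand-in for the runtime PM usage counter */
	bool irq_running;	/* stand-in for WL1271_FLAG_IRQ_RUNNING */
	bool suspended;		/* true when the modelled device is suspended */
};

/* Models pm_runtime_get_sync(): take a reference and make the device active. */
static void model_get_sync(struct model_dev *d)
{
	d->usage_count++;
	d->suspended = false;
}

/* Models dropping the reference at the end of the section. */
static void model_put(struct model_dev *d)
{
	d->usage_count--;
}

/* Models an autosuspend attempt: only allowed when no reference is held. */
static bool model_try_suspend(struct model_dev *d)
{
	if (d->usage_count > 0)
		return false;
	d->suspended = true;
	return true;
}

/*
 * Models the relevant part of the locked IRQ handler: the flag is cleared
 * while the reference is held. Returns true if a suspend could have
 * happened at the moment the flag was cleared.
 */
static bool model_irq_locked(struct model_dev *d)
{
	bool could_suspend_at_clear;

	model_get_sync(d);
	d->irq_running = false;
	could_suspend_at_clear = model_try_suspend(d);
	model_put(d);

	return could_suspend_at_clear;
}
```

In this model a suspend attempt made at the point where the flag is cleared
always fails, because the reference taken by model_get_sync() is still held.
A suspend only becomes possible again after model_put().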

Regards,

Tony


> -- 
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


