Search Linux Wireless

Re: ath5k: kernel timing screwed - due to unserialised register access?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Wed, 15 Oct 2008, Elias Oltmanns wrote:
>> Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
>> > On Sat, 11 Oct 2008, Elias Oltmanns wrote:
>> >> > Compile the acpi_processor module in to the kernel
>> >
>> >> > (CONFIG_ACPI_PROCESSOR=y) and add processor.max_cstate=1 to the kernel
>> >> > command line. If I analysed the problem correctly this will make the
>> >> > jiffies problem go away. I'm working on a fix.
>> >> 
>> >> Spot on, it does go away regardless whether NO_HZ or HIGH_RES are
>> >> enabled or disabled. Looking forward to testing your fix ;-).
>> >
>> > Here you go.
>> 
>> Bad luck, I'm afraid. Your patch seems to fix the issue for NO_HZ=n +
>> HIGH_RES=y. As soon as NO_HZ=y, however, the problem reappears. See the
>
> Hmm. You should have seen the same problem w/o that patch in the
> "CONFIG_ACPI_PROCESSOR=y / processor.max_cstate=1" test with NOHZ=y.

Right you are, I've just retested that case. The reason why I didn't hit
this the last time is that it seems to be triggered far less often than
without the processor.max_cstate=1 option.

>
> Anyway, I can see what the NOHZ problem is. Updated patch below.

Nice, it seems to have done the trick. I'll keep an eye on my logs to
make sure it doesn't pop up again.

>
>> output below. (Still testing on 2.6.27, mind. Should I test something
>> more up-to-date?)
>
> .27 is fine. 
>
> Did you make any progress finding out why the ath5k softirq runs for
> >20ms ?  We need to fix this madness as well :)

Well, it wasn't obvious to me so far, whether the logs really indicated
that 20 msecs had been spent in te callback or whether all this was due
to the bug in the timing code. With your patch applied, I have
eventually made further investigations into the matter. The problem is
the following snippet from
drivers/net/wireless/ath5k/phy.c:ath5k_hw_noise_floor_calibration():

	/*
	 * Enable noise floor calibration
	 */
	AR5K_REG_ENABLE_BITS(ah, AR5K_PHY_AGCCTL,
				AR5K_PHY_AGCCTL_NF);

	ret = ath5k_hw_register_timeout(ah, AR5K_PHY_AGCCTL,
			AR5K_PHY_AGCCTL_NF, 0, false);

The first call sets a bit in the AR5K_PHY_AGCCTL register and the second
waits for that bit to be cleared by the hardware again. Apparently, it
takes roughly 20 ms to clear that bit.

In order to execute ath5k_hw_noise_floor_calibration() in process
context, I'd suggest introducing a single threaded workqueue for the
ath5k driver and scheduling calibration from the calib_timer callback.
Additionally, it would be necessary to schedule resets in a similar
manner instead of using the ath5k_tasklet_reset() tasklet. This requires
some serialisation but in my opinion there are various serialisation
issues in ath5k as it is that need fixing. Unfortunately, none of the
concerns I have raised wrt the ath5k driver seem to have resulted in a
commit that I'm aware of even though patches have been supplied. Perhaps
it's the merge window that consumes people's resources. Maybe I'll post
a patch addressing this particular issue in the next few days.

Regards,

Elias
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]
  Powered by Linux