On 2010-06-29 11:40 PM, Björn Smedman wrote:
> 2010/6/29 Björn Smedman <bjorn.smedman@xxxxxxxxxxx>:
>> Yes, hw reset is due to reg = 0x01702400 every 4 - 40 seconds or so:
>> ...
>
> When the chip is really stuck, does 'reg' (at 'return false') change
> over time? If I add a second requirement that ath9k_hw_check_alive()
> must fail three times in a row (in different invocations of
> ath9k_tasklet()) before we reset the chip the ap seems fine. I
> sometimes get several of these reg = 0x01702400 every second but only
> one or at the max two in a row.
>
> Under load I sometimes see some reg = 0x00f02400 as well. I also see
> an occasional reset now and then (about once a minute) that must be
> caused by something else.
>
> Any insight into what these reg values mean? Do you think they can
> safely be ignored as per above?

I had a similar thought about requiring multiple invocations. I think
that's a good approach in general, but we need to make sure it is safe.
The main point of this function is to detect baseband hangs. If we
experience such a hang, I'm not sure we will always get enough
interrupts to run multiple consecutive tests.

One way to make it safe would be to reschedule the tasklet each time we
ignore the result of ath9k_hw_check_alive(); that way we keep the
detection time low as well. Maybe we could also use a timer to leave
10 ms between attempts.

Another thing that I'm working on right now is ensuring that the TSF
gets preserved across resets. For some AR9280 based cards the code
already preserves the TSF in software over the chip reset; I could
simply extend that to cover SoC as well. But before I post such a
patch, I'll do a test on AR9160 to see if it would be better to make
the TSF preservation unconditional.

- Felix
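
To make the "fail several times in a row, but keep rechecking" idea concrete, here is a minimal user-space sketch of the counting logic only. It is not ath9k code: `struct hang_detector`, `enum hang_action`, `hang_detector_init()` and `hang_detector_update()` are hypothetical names invented for illustration. The assumption is that `alive` would be fed from the result of ath9k_hw_check_alive(), and that the caller would map `HANG_RECHECK` to rescheduling the tasklet (or arming a short timer, e.g. around 10 ms) so that detection does not depend on further interrupts arriving.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper state; none of these names exist in ath9k. */
struct hang_detector {
	unsigned int threshold;  /* consecutive failures required before a reset */
	unsigned int fail_count; /* length of the current run of failures */
};

enum hang_action {
	HANG_NONE,    /* check passed, nothing to do */
	HANG_RECHECK, /* check failed, below threshold: caller should re-run the check soon */
	HANG_RESET,   /* threshold reached: caller should reset the chip */
};

static void hang_detector_init(struct hang_detector *d, unsigned int threshold)
{
	assert(threshold > 0);
	d->threshold = threshold;
	d->fail_count = 0;
}

/*
 * Feed one check result into the detector.
 * 'alive' is assumed to be the return value of the hardware liveness check.
 */
static enum hang_action hang_detector_update(struct hang_detector *d, bool alive)
{
	if (alive) {
		/* A single good reading ends the run of failures. */
		d->fail_count = 0;
		return HANG_NONE;
	}

	if (++d->fail_count >= d->threshold) {
		/* Start counting from scratch after the reset. */
		d->fail_count = 0;
		return HANG_RESET;
	}

	/* Ignored failure: ask the caller to check again without waiting for an interrupt. */
	return HANG_RECHECK;
}
```

With a threshold of 3, the "one or at the max two in a row" pattern reported above would only ever produce `HANG_RECHECK`, while a real hang would reach `HANG_RESET` after three back-to-back rechecks, even if the hardware stops raising interrupts.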