On 02/14/2012 02:29 PM, Felix Fietkau wrote:
On 2012-02-14 7:43 PM, Ben Greear wrote:
On 02/14/2012 10:29 AM, Sujith wrote:
Ben Greear wrote:
Actually, I think it might be useful to have a second level of debugging.
I hope to soon have time & resources to add some logic to dump lots of register
info and such in human-readable format (like, when DMA times out). That is going to be a lot
of strings added to the driver, so the compile size will definitely
increase. If keeping the size small is important, then this sort of verbose thing
could be hidden behind a second level of debugging...
That could be implemented similarly to what usbmon does. A debugfs file that could
be read and redirected to a file. And there would be no overhead to the
driver, I think. We could call it the 'event log'. :)
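As a rough illustration of the 'event log' idea above, here is a minimal userspace sketch of a fixed-size ring buffer that a driver could append events to and a reader could drain. It is not kernel or debugfs code; the names evlog, evlog_record and evlog_read, the entry layout, and the buffer size are all assumptions made for this example. Recording is a few stores and the oldest entries are overwritten when nobody is reading.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event log; size must be a power of two. */
#define EVLOG_SIZE 8

struct evlog_entry {
	uint32_t code;	/* hypothetical event identifier */
	uint32_t arg;	/* hypothetical event argument, e.g. a register value */
};

struct evlog {
	struct evlog_entry ring[EVLOG_SIZE];
	unsigned int head;	/* total number of entries written */
	unsigned int tail;	/* total number of entries consumed */
};

/* Append one event; if the ring is full, the oldest entry is dropped. */
static void evlog_record(struct evlog *log, uint32_t code, uint32_t arg)
{
	struct evlog_entry *e = &log->ring[log->head % EVLOG_SIZE];

	e->code = code;
	e->arg = arg;
	log->head++;
	if (log->head - log->tail > EVLOG_SIZE)
		log->tail = log->head - EVLOG_SIZE;
}

/* Copy the oldest unread event into *out. Returns 1 on success, 0 if empty. */
static int evlog_read(struct evlog *log, struct evlog_entry *out)
{
	if (log->tail == log->head)
		return 0;
	*out = log->ring[log->tail % EVLOG_SIZE];
	log->tail++;
	return 1;
}
```

In a real driver the read side would sit behind a debugfs file and the two sides would need locking, both of which this sketch leaves out.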
I was thinking about adding a method that grabbed as many registers
as I have info for and dumping them with printk when DMA errors
hit. This would make kernel splats more useful.
And also have a debugfs file called 'registers' or similar that one
could cat out and get similar info. And this can let folks look
at steady-state or whatever.
But, the logic to turn the register bit values into strings would
be in the driver (and thus add some code size bloat).
My hope is that this would allow a better chance of understanding
the stop-DMA errors that some people get reliably (but which I can never reliably
reproduce).
I'm not sure how that plays into your 'event log' idea, but maybe
one will help the other.
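The "turn the register bit values into strings" logic described above could look roughly like the following table-driven sketch. It is plain userspace C, and the register name, bit masks, and bit names are invented for illustration; they do not reflect the real hardware register layout. The string table is where the code-size cost mentioned above comes from.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One named bit (or mask) within a register. */
struct reg_bit_name {
	uint32_t mask;
	const char *name;
};

/* Hypothetical bit names, for illustration only. */
static const struct reg_bit_name example_dma_status_bits[] = {
	{ 0x00000001u, "RX_ENABLED" },
	{ 0x00000002u, "TX_ENABLED" },
	{ 0x00000004u, "RX_STOP_PENDING" },
	{ 0x00000100u, "RESET_IN_PROGRESS" },
};

/*
 * Format "NAME=0xVALUE [BIT1 BIT2 ...]" into buf, listing every table
 * entry whose mask is set in val. Output is truncated to fit len.
 * Returns the number of characters stored, not counting the NUL.
 */
static size_t decode_reg(char *buf, size_t len, const char *reg_name,
			 uint32_t val, const struct reg_bit_name *bits,
			 size_t nbits)
{
	size_t off, i;
	int n, first = 1;

	if (len == 0)
		return 0;

	n = snprintf(buf, len, "%s=0x%08x [", reg_name, (unsigned int)val);
	if (n < 0) {
		buf[0] = '\0';
		return 0;
	}
	off = (size_t)n < len ? (size_t)n : len - 1;

	for (i = 0; i < nbits && off < len - 1; i++) {
		if (!(val & bits[i].mask))
			continue;
		n = snprintf(buf + off, len - off, "%s%s",
			     first ? "" : " ", bits[i].name);
		if (n < 0)
			break;
		off += (size_t)n < len - off ? (size_t)n : len - off - 1;
		first = 0;
	}

	if (off < len - 1) {
		buf[off++] = ']';
		buf[off] = '\0';
	}
	return off;
}
```

The same function could feed both a printk on a DMA error and a 'registers' debugfs file, since it only fills a caller-supplied buffer.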
I think the 'let's dump all kinds of random crap when the issue occurs
until we find somebody that can parse it' approach won't work here, and
I really think it's not a good idea in general.
In the past the stop-DMA crap has been a symptom with a wild variety of
different causes, most of which were actually *software* race
conditions, e.g. dma tx or rx enable during reset, locking issues, etc.
I'm interested in parsing it. There are folks that can reproduce
this bug every time, and it seems none of the developers can reproduce
it. So, the only thing I can think to do is to try to get more info
from the folks that see the problem.
The good news is that some of them are desperate enough to run my
hacked kernel, so if such patches are not wanted upstream,
I'll just put it in my tree and see if I can get any useful
info from them that way...
Let's not carpet-bomb the driver with lots of debug crap that probably
won't ever lead anybody to any good solution for the remaining issues,
let's fix stuff the old-fashioned way: by reading the code,
understanding what's going on, analyzing problems in a systematic way,
rather than clouding the whole process with assumptions based on old
bugs that have since been fixed.
That sounds good, and I hope folks are doing that. But as for me,
unless I can reproduce a problem I don't think I'll be able to
do much, as I don't understand the code that well
and I don't have any access to folks who know the details
of the hardware and such...
Thanks,
Ben
--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com