Re: Reporting bugs and bisection

Arjan van de Ven <arjan@xxxxxxxxxxxxx> · Mon, 14 Apr 2008 11:24:44 -0700

On Mon, 14 Apr 2008 10:51:52 -0700
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> Well OK.  But I don't think we can generalise from oops-causing bugs

That includes all WARN_ONs and various other bugs that produce a kernel backtrace.

> all the way to all bugs.  Very few bugs actually cause oopses, and
> oopses tend to be the thing which developers will zoom in on and pay
> attention to.

Maybe.

> If we had metrics on "time goes backwards" or anything containing
> "ASUS", things might be different.

It really sounds like we need to add more strategically placed WARN_ONs and other
diagnostics to the kernel to track these issues down.

Another thing I have found so far is that what hits LKML is far from representative
of what users actually run into. The most obvious example was the whole input layer
refcounting disaster in 2.6.25-rc; this was about a third of TOTAL reports for a few
weeks in a row, but there was hardly an LKML posting for it (in fact there was only
one, and only half of one at that).
We need diagnostics and other output from the kernel that automated tools can detect,
otherwise we will very likely not get good information on what is actually wrong with
the kernel.
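
To make "automated tools can detect these" concrete, here is a rough sketch of the
kind of log scan such a tool might do. It is an illustration only, not how
kerneloops.org collects its data: the function name find_diagnostics and the marker
strings are assumptions picked for the example, and the exact text the kernel prints
varies between versions.

```python
import re

# Assumed marker substrings, chosen for illustration only. This is not an
# exhaustive or authoritative list of what the kernel prints.
DIAGNOSTIC_PATTERNS = [
    re.compile(r"WARNING: at "),    # assumed prefix of a WARN_ON report
    re.compile(r"\bOops\b"),        # assumed marker of an oops
    re.compile(r"kernel BUG at "),  # assumed marker of a BUG() hit
]


def find_diagnostics(log_text):
    """Return (line_number, line) pairs for log lines matching any marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in DIAGNOSTIC_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

Fed the text of a kernel log, this returns the numbered lines that look like
diagnostics. A bug that prints nothing recognisable ("time goes backwards" without a
WARN_ON, say) is invisible to a scan like this, which is the argument for adding more
diagnostics.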

In case you want to see the 2.6.25-rc data, the top 100 list is at
http://www.kerneloops.org/twentyfive.html

(I'm still working on annotating the individual items, but since there are 100
of them, that takes time.)

-- 
If you want to reach me at my work email, use arjan@xxxxxxxxxxxxxxx
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org