On Wed, 2012-09-19 at 15:08 +0200, Joerg Roedel wrote:
> On Tue, Sep 18, 2012 at 01:42:49PM -0600, Shuah Khan wrote:
> > Are you ok with the system wide and per device error counts I added? Any
> > comments on the overall approach?
>
> The general approach of having error counters is fine. But the addresses
> allocated/addresses checked thing should be done per allocation and not
> with counter comparison for several reasons:
>
>     1. When doing it per-allocation we know exactly which allocation
>        was not checked and can tell the driver developer. The code
>        saves stack-traces for that. This is much more useful than
>        telling the developer 'somewhere you do not check your
>        dma-handles'

Right. It would point directly to the actual mapping instead of a blind
count.

>     2. Checking this per-allocation gives you the per-device and
>        also the per-driver checking you want.

Yes it would.

>     3. You don't need to change 'struct device' for that.

Right - heard from others as well on this one :)

> There are more reasons, like that this approach fits a lot better to the
> general idea of the DMA-API debugging code.
>
> > The approach you suggested will cover the cases where drivers fail to
> > check good map cases. We won't be able to catch failed maps that get used
> > without checks. Are you not concerned about these cases? These could
> > cause a silent error with wild writes or could bring the system down. Or
> > are you recommending changing the infrastructure to track failed maps as
> > well?
>
> It is fine to only check the good-map cases. Think about what
> DMA-debugging is good for: It is a tool for driver developers to find
> bugs in their code they wouldn't notice otherwise. An unchecked bad-map
> case is a bug they would notice otherwise. So if we check only the
> good-map cases and warn the driver developers about non-checked
> addresses they fix it and make the drivers more robust against failed
> allocations, fixing also the bad-map cases.

OK, that makes sense now that I understand the scope of the dma-debug API.
Here is what I will do then: check only the good maps. With that scope,
there is no need for another table.

> > I am still pursuing a way to track failed map cases. I combined the flag
> > idea with one of the ideas I am looking into. Details below: (if this
> > sounds like a reasonable approach, I can do v2 patch and we can discuss
> > the code)
>
> Why do you want to track the bad-map cases?

I am still concerned about data-corruption issues that will be hard to
debug, and I was hoping an error count might serve as an indicator.
However, I agree with you that an error count without the associated
mapping is not very useful.

-- Shuah
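
For illustration only, the per-allocation checking discussed above could be
sketched in userspace C roughly as follows. This is not the dma-debug code:
the names track_map(), track_check(), track_unmap() and struct map_entry are
hypothetical, a fixed-size array stands in for the real bookkeeping, and a
plain string stands in for the saved stack trace. The point is only that a
"checked" flag kept with each mapping lets the warning name the exact
mapping, which a counter comparison cannot do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_ENTRIES 16

/* Hypothetical per-mapping record; the real dma-debug entry differs. */
struct map_entry {
    uint64_t dma_addr;   /* handle returned by the (simulated) map call */
    const char *origin;  /* stands in for the saved stack trace */
    bool in_use;         /* slot currently holds a live mapping */
    bool checked;        /* set once the driver has tested the handle */
};

static struct map_entry table[MAX_ENTRIES];

/* Look up the live entry for a handle, or NULL if it is not tracked. */
static struct map_entry *find_entry(uint64_t dma_addr)
{
    for (size_t i = 0; i < MAX_ENTRIES; i++) {
        if (table[i].in_use && table[i].dma_addr == dma_addr)
            return &table[i];
    }
    return NULL;
}

/* Record a successful mapping. Returns 0, or -1 if the table is full. */
int track_map(uint64_t dma_addr, const char *origin)
{
    assert(origin != NULL);
    for (size_t i = 0; i < MAX_ENTRIES; i++) {
        if (!table[i].in_use) {
            table[i] = (struct map_entry){
                .dma_addr = dma_addr,
                .origin = origin,
                .in_use = true,
                .checked = false,
            };
            return 0;
        }
    }
    return -1;
}

/* Called where the driver tests the handle for a mapping error. */
void track_check(uint64_t dma_addr)
{
    struct map_entry *e = find_entry(dma_addr);

    if (e)
        e->checked = true;
}

/*
 * Called on unmap. Returns 1 (and warns, naming the mapping) if the
 * handle was never checked, 0 if it was checked, -1 if it is unknown.
 */
int track_unmap(uint64_t dma_addr)
{
    struct map_entry *e = find_entry(dma_addr);

    if (!e)
        return -1;

    int unchecked = e->checked ? 0 : 1;

    if (unchecked)
        fprintf(stderr,
                "warning: handle 0x%llx mapped at %s was never checked\n",
                (unsigned long long)dma_addr, e->origin);

    e->in_use = false;
    return unchecked;
}
```

In this sketch a driver that calls track_map() and then track_unmap()
without track_check() in between gets a warning that names that one
mapping, and nothing has to be added to 'struct device'. Failed maps are
deliberately not tracked, matching the good-map-only scope agreed above.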