> On 18.04.2015, at 19:07, Mark Brown <broonie@xxxxxxxxxx> wrote: > > Sure, but with the fallback to a much longer sleeping delay that'd be > covered transparently anyway - it doesn't matter if we don't always > manage to busy wait under load, what's more important is that we don't > fail the operation as a whole. > > Actually if it's just scheduling that's a concern then one final check > after the time expires should do the trick shouldn't it? Well - I have tried now lots of things now and I have also started some diagnostics on timeouts as well as trying to detect reschedules and then report on the jiffies that have passed between start of loop and the check and I get values like this: [ 5867.943426] mcp251x spi32759.2: detected reschedule: 7 jiffies [ 5868.628859] mcp251x spi32759.2: detected reschedule: 8 jiffies [ 5869.867311] mcp251x spi32759.2: detected reschedule: 8 jiffies [ 5870.139622] mcp251x spi32759.0: detected reschedule: 8 jiffies [ 5870.272004] mcp251x spi32759.2: detected reschedule: 8 jiffies [ 5871.352704] mcp251x spi32759.0: detected reschedule: 5 jiffies [ 5871.469095] mcp251x spi32759.0: detected reschedule: 8 jiffies [ 5908.783971] mcp251x spi32759.2: detected reschedule: 75 jiffies [ 5955.831822] mcp251x spi32759.0: detected reschedule: 75 jiffies [ 5977.380447] mcp251x spi32759.2: detected reschedule: 82 jiffies [ 5977.602457] mcp251x spi32759.0: detected reschedule: 15 jiffies [ 5980.093570] mcp251x spi32759.2: detected reschedule: 2 jiffies [ 6006.559147] mcp251x spi32759.2: detected reschedule: 5 jiffies [ 6009.030543] mcp251x spi32759.0: detected reschedule: 8 jiffies [ 6016.872027] mcp251x spi32759.0: detected reschedule: 27 jiffies [ 6018.808468] mcp251x spi32759.2: detected reschedule: 82 jiffies [ 6021.586796] mcp251x spi32759.2: detected reschedule: 81 jiffies [ 6024.411264] mcp251x spi32759.2: detected reschedule: 77 jiffies [ 6033.868509] mcp251x spi32759.2: detected reschedule: 55 jiffies [ 6035.914697] mcp251x spi32759.0: detected reschedule: 81 jiffies [ 6051.216355] mcp251x spi32759.2: detected reschedule: 82 jiffies [ 6051.801190] mcp251x spi32759.0: detected reschedule: 55 jiffies [ 6052.075748] mcp251x spi32759.0: detected reschedule: 27 jiffies [ 6054.004605] mcp251x spi32759.0: detected reschedule: 81 jiffies [ 6068.815078] mcp251x spi32759.2: detected reschedule: 27 jiffies [ 6072.394473] mcp251x spi32759.2: detected reschedule: 81 jiffies [ 6081.257468] mcp251x spi32759.2: detected reschedule: 81 jiffies [ 6085.751653] mcp251x spi32759.2: detected reschedule: 82 jiffies [ 6111.253374] mcp251x spi32759.2: detected reschedule: 82 jiffies [ 6119.573181] mcp251x spi32759.2: detected reschedule: 74 jiffies [ 6131.105261] mcp251x spi32759.0: detected reschedule: 4 jiffies [ 6131.960492] mcp251x spi32759.2: detected reschedule: 82 jiffies [ 6132.808198] mcp251x spi32759.2: detected reschedule: 82 jiffies [ 6132.958858] mcp251x spi32759.2: detected reschedule: 8 jiffies [ 6137.992678] mcp251x spi32759.2: detected reschedule: 48 jiffies [ 6144.553370] mcp251x spi32759.2: detected reschedule: 34 jiffies [ 6146.819876] mcp251x spi32759.0: detected reschedule: 82 jiffies [ 6146.969923] mcp251x spi32759.2: detected reschedule: 8 jiffies [ 6152.984406] mcp251x spi32759.2: detected reschedule: 82 jiffies [ 6170.475425] mcp251x spi32759.0: detected reschedule: 74 jiffies [ 6174.954398] mcp251x spi32759.2: detected reschedule: 82 jiffies That happens when running torture CAN message receive test and concurrently taring the “*.o” files of the kernel - which produces lots of IO... But sometimes it also happens during other “minimal” activities - but that is a much rarer event. This is really the worsted case, as even those typical 8000 irq/s on an idle RPI sometimes drop down to 200 irqs/s (plus high block out counts) - and then the system is no very responsive to put it mildly... So the only solution seem to be: if a timeout of > 2 jiffies occurs fall back to interrupts. This way we are catering for the worsted case and at least our code should not make some drivers fail due to missing error-handling in the driver when the system is overloaded and produces this kind of delays. In the hope that this is an acceptable approach I will post a patch as soon as I have confirmed it to be working with a much longer torture-test that does not show any hiccups on the driver-side. -- To unsubscribe from this list: send the line "unsubscribe linux-spi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html