On Thu, May 2, 2013 at 11:17 AM, Chris Wilson <chris at chris-wilson.co.uk> wrote: > On Thu, May 02, 2013 at 12:00:03PM +0300, Imre Deak wrote: >> Due to possible scheduling latencies wait_event_timeout doesn't >> guarantee a non-zero return value, even if the condition becomes true >> before the specified timeout expires. Thus we can incorrectly signal a >> timeout and abort a DP AUX transaction. >> >> If wait_event_timeout returns 0, it's guaranteed that at least the >> specified timeout (minus one jiffies, see below) had passed, so we can >> fix this by checking the condition explicitly in this case. >> >> Also the timeout that wait_event_timeout() is guaranteed to wait if the >> condition doesn't become true is one less jiffies than what is passed to >> it as a parameter. This is because the absolute expiration time in >> schedule_timeout() may be calculated at a moment close to the next >> scheduling tick, when jiffies is incremented. So make sure we pass always >> a jiffies value of 2 or greater. Here this makes a difference only for >> HZ=100. >> >> This fixes DP AUX errors I saw during booting on an ILK. >> >> This should ideally be fixed in wait_event_timeout(), but that can take >> a while. Until that's done use this fix as a band-aid. > > As we have 3 such vulnerable callsite in our driver alone, perhaps we > should push for your general fix. Yeah, I think this should be fixed in general, doesn't really make much sense given that the jiffies value is relative this way. Also, the actual timeout we need to wait is just 400 us, so I think we're pretty safe here. Can you please resumbit just the recheck part of your patch? Also, can you do the same fix for gmbus_wait_idle in intel_i2c.c please? Thanks, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch