On Fri, Sep 5, 2014 at 12:19 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> On Thu, Sep 04, 2014 at 11:37:24PM -0700, Luis R. Rodriguez wrote:
> ...
>> +		/*
>> +		 * I got SIGKILL, but wait for 60 more seconds for completion
>> +		 * unless chosen by the OOM killer. This delay is there as a
>> +		 * workaround for boot failure caused by SIGKILL upon device
>> +		 * driver initialization timeout.
>> +		 *
>> +		 * N.B. this will actually let the thread complete regularly,
>> +		 * wait_for_completion() will be used eventually, the 60 second
>> +		 * try here is just to check for the OOM over that time.
>> +		 */
>> +		WARN_ONCE(!test_thread_flag(TIF_MEMDIE),
>> +			  "Got SIGKILL but not from OOM, if this issue is on probe use .driver.async_probe\n");
>> +		for (i = 0; i < 60 && !test_thread_flag(TIF_MEMDIE); i++)
>> +			if (wait_for_completion_timeout(&done, HZ))
>> +				goto wait_done;
>> +
>
> Ugh... Jesus, this is way too hacky, so now we fail on 90s timeout
> instead of 30?

Nope! I fell into the same trap, and only thanks to tons of patience on
Tetsuo's part was I able to grok that the 60 seconds here are not for
increasing the timeout. This is just time spent checking that the OOM
killer wasn't what triggered the SIGKILL. Even if the drivers took eons
it should be fine now, I tried it :D

> Why do we even need this with the proposed async
> probing changes?

Ah -- well, without it the way we "find" drivers that need this new
"async feature" is by a bug report and folks saying their system can't
boot, or that their device doesn't come up. That's all. Tracing this to
systemd and a timeout was one of the ugliest things ever.
There are two insane bug reports you can go check. mptsas was the first:

http://article.gmane.org/gmane.linux.kernel/1669550
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1297248

Then cxgb4:

https://bugzilla.novell.com/show_bug.cgi?id=877622

I only had Cc'd you on the newest gem, pata_marvell:

https://bugzilla.kernel.org/show_bug.cgi?id=59581

We can't seriously expect to be doing all this work for every driver.
A WARN_ONCE() would enable us to find the drivers that need this new
async probe "feature".

  Luis
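
A rough userspace model of the loop quoted above may help show why the 60 seconds is a window for spotting an OOM kill and not a longer deadline. This is only a sketch based on the comment in the patch: struct sim_state, sim_memdie(), sim_wait_one_second() and wait_after_sigkill() are hypothetical names, not kernel API, and what the caller does once an OOM kill is seen is outside the quoted hunk, so the model just reports it.

```c
#include <stdbool.h>

/*
 * Userspace model only. Every name here is hypothetical; none of this is
 * kernel API. Time advances one "second" per simulated wait.
 */
struct sim_state {
	int now;      /* seconds elapsed since SIGKILL was seen */
	int done_at;  /* second at which the completion fires (must be >= 1) */
	int oom_at;   /* second at which TIF_MEMDIE gets set, or -1 for never */
};

enum wait_result {
	DONE_IN_WINDOW,     /* completion arrived during the 60 s check */
	DONE_AFTER_WINDOW,  /* completion arrived later, via the unbounded wait */
	OOM_SEEN,           /* OOM killer picked us; caller decides what to do */
};

/* Stands in for test_thread_flag(TIF_MEMDIE). */
static bool sim_memdie(const struct sim_state *s)
{
	return s->oom_at >= 0 && s->now >= s->oom_at;
}

/* Stands in for wait_for_completion_timeout(&done, HZ): wait up to 1 s. */
static bool sim_wait_one_second(struct sim_state *s)
{
	s->now++;
	return s->now >= s->done_at;
}

static enum wait_result wait_after_sigkill(struct sim_state *s)
{
	int i;

	/* The 60 s window: it polls for OOM, it is not a driver deadline. */
	for (i = 0; i < 60 && !sim_memdie(s); i++)
		if (sim_wait_one_second(s))
			return DONE_IN_WINDOW;

	if (sim_memdie(s))
		return OOM_SEEN;

	/* Not OOM: stands in for the eventual wait_for_completion(). */
	s->now = s->done_at;
	return DONE_AFTER_WINDOW;
}
```

In this model a probe that needs 600 seconds still completes as long as no OOM kill shows up, which matches the "even if the drivers took eons" observation; only a simulated TIF_MEMDIE cuts the wait short.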