On Mon, Aug 07, 2017 at 11:26:09AM +0100, Matt Redfearn wrote: > Hi Luis, > On 03/08/17 00:23, Luis R. Rodriguez wrote: > > On Tue, Aug 01, 2017 at 07:28:20PM -0700, Kees Cook wrote: > > > On Tue, Aug 1, 2017 at 5:12 PM, Luis R. Rodriguez <mcgrof@xxxxxxxxxx> wrote: > > > > On Fri, Jul 21, 2017 at 03:05:20PM +0100, Matt Redfearn wrote: > > > > > Commit 6d7964a722af ("kmod: throttle kmod thread limit") which was > > > > > merged in v4.13-rc1 broke this behaviour since the recursive modprobe is > > > > > no longer caught, it just ends up waiting indefinitely for the kmod_wq > > > > > wait queue. Hence the kernel appears to hang silently when starting > > > > > userspace. > > > > Indeed, the recursive issue were no longer expected to exist. > > > Errr, yeah, recursive binfmt loads can still happen. > > > > > > > The *old* implementation would also prevent a set of binaries to daisy chain > > > > a set of 50 different binaries which require different binfmt loaders. The > > > > current implementation enables this and we'd just wait. There's a bound to > > > > the number of binfmd loaders though, so this would be bounded. If however > > > > a 2nd loader loaded the first binary we'd run into the same issue I think. > > > > > > > > If we can't think of a good way to resolve this we'll just have to revert > > > > 6d7964a722af for now. > > > The weird but "normal" recursive case is usually a script calling a > > > script calling a misc format. Getting a chain of modprobes running, > > > though, seems unlikely. I *think* Matt's patch is okay, but I agree, > > > it'd be better for the request_module() to fail. > > In that case how about we just have each waiter only wait max X seconds, > > if the number of concurrent ongoing modprobe calls hasn't reduced by > > a single digit in X seconds we give up on request_module() for the > > module and clearly indicate what happened. > > > > Matt, can you test? > > Sure - I've tested patch this on Cavium Octeon under the same conditions as > before (64 bit kernel with 32bit userspace & no binfmt handler builtin). > > The failing modprobe is now caught and rather than silence we get the > expected kernel panic, albeit all the warnings look quite noisy. Thanks for testing! I agree its all too verbose, I'll clean that up and resubmit with a cleaner log. I tried to devise a test case for this but for the life of me I could not. If you happen to come up with something please feel free to submit one for lib/test_kmod.c! > In any case, this is better than the 4.13-rc1 behavior, so > > Tested-by: Matt Redfearn <matt.redfearn@xxxxxxxxxx> Luis