Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count

"Luis R. Rodriguez" <mcgrof@xxxxxxxxxx> · Thu, 25 May 2017 10:38:40 -0700

On Thu, May 25, 2017 at 10:30 AM, Dmitry Torokhov
<dmitry.torokhov@xxxxxxxxx> wrote:
> On Thu, May 25, 2017 at 09:50:15AM -0700, Luis R. Rodriguez wrote:
>> On Thu, May 25, 2017 at 9:38 AM, Dmitry Torokhov
>> <dmitry.torokhov@xxxxxxxxx> wrote:
>> > On Thu, May 25, 2017 at 06:22:01PM +0200, Luis R. Rodriguez wrote:
>> >> On Fri, May 19, 2017 at 02:58:29PM -0700, Dmitry Torokhov wrote:
>> >> > On Fri, May 19, 2017 at 02:45:29PM -0700, Luis R. Rodriguez wrote:
>> >> > > On May 19, 2017 1:45 PM, "Dmitry Torokhov" <dmitry.torokhov@xxxxxxxxx>
>> >> > > wrote:
>> >> > >
>> >> > > On Thu, May 18, 2017 at 08:24:39PM -0700, Luis R. Rodriguez wrote:
>> >> > > > We currently statically limit the number of modprobe threads which
>> >> > > > we allow to run concurrently to 50. As per Keith Owens, this was a
>> >> > > > completely arbitrary value, and it was set in the 2.3.38 days [0]
>> >> > > > over 16 years ago in year 2000.
>> >> > > >
>> >> > > > Although we haven't yet hit our lower limits, experimentation [1]
>> >> > > > shows that when and if we hit this limit in the worst case, will be
>> >> > > > fatal -- consider get_fs_type() failures upon mount on a system which
>> >> > > > has many partitions, some of which might even be with the same
>> >> > > > filesystem. Its best to be prudent and increase and set this
>> >> > > > value to something more sensible which ensures we're far from hitting
>> >> > > > the limit and also allows default build/user run time override.
>> >> > > >
>> >> > > > The worst case is fatal given that once a module fails to load there
>> >> > > > is a period of time during which subsequent request for the same module
>> >> > > > will fail, so in the case of partitions its not just one request that
>> >> > > > could fail, but whole series of partitions. This later issue of a
>> >> > > > module request failure domino effect can be addressed later, but
>> >> > > > increasing the limit to something more meaninful should at least give us
>> >> > > > enough cushion to avoid this for a while.
>> >> > > >
>> >> > > > Set this value up with a bit more meaninful modern limits:
>> >> > > >
>> >> > > > Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
>> >> > > > Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
>> >> > > >
>> >> > > > Also allow the default max limit to be further fine tuned at compile
>> >> > > > time and at initialization at run time at boot up using the kernel
>> >> > > > parameter: max_modprobes.
>> >> > > >
>> >> > > > [0] https://git.kernel.org/cgit/linux/kernel/git/history/
>> >> > > history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
>> >> > > > [1] https://github.com/mcgrof/test_request_module
>> >> > >
>> >> > > If we actually run into this issue, instead of slamming the system with
>> >> > > bazillion concurrent requests, can we wait for the other modprobes to
>> >> > > finish and then continue?
>> >> > >
>> >> > >
>> >> > > Yes ! That I have a patch that does precisely that ! That is actually still
>> >> > > *not enough* to not fail fatally but this would be subject of another
>> >> > > series with more debatable approaches.
>> >> > >
>> >> >
>> >> > Then please post it.
>> >>
>> >> Will do.
>> >>
>> >> > > This at least pushes us to closer safer limits for now while also making it
>> >> > > configurable.
>> >> >
>> >> > Making it configurable depending on how big/little box is makes no
>> >> > sense,
>> >>
>> >> If we set a hard limit then we need to patch a system if we need to increment
>> >> it. This is rather stupid given we have no current heuristics to make kmod
>> >> loading deterministic from userspace, and in the worst case this can be fatal.
>> >> General system size is a good first guess, but making it configurable is
>> >> really key given current limitations. I'll post further patches which reveals
>> >> some of these issues more clearly.
>> >>
>> >> > especially if the above is implemented, as depth of modprobe
>> >> > invocations depends on configuration and not computing power of the
>> >> > hardware the system is running on.
>> >>
>> >> You seem to agree making it configurable is sensible , but not depending on
>> >> the system size ?
>> >
>> > No, I am saying that making it configurable based on system size makes
>> > no sense at all, and making it configurable given you already have
>> > patches removing hard failures gives no benefit.
>>
>> Ah no, the problem is that hard failures are not yet removed in this
>> patch set at all! This series only contains the things I thought were
>> non-radical really.
>
> I know they are not removed in this patch set.
>
>>
>> In fact even with the subsequent patches from my 2nd series I'll
>> eventually post post -- these fatal issues are not cured at all unless
>> we dance with userspace a bit, or unless as you suggest we have *all*
>> pending theads wait without killing any.
>
> Well, that is too bad, I understood you already implemented what I
> suggested.

We seem to want the same so that is good actually.

>> *This* small patch should enable folks to move the needle to a more
>> *fair* limit, its also useful to backport a simple fix even if the
>> other stuff is not merged, *but* it *also* provides a way for systems
>> to move away from the slippery slope if they know what they are doing.
>
> Look, you are trying to push a band-aid solution for a problem that is
> purely theoretical (as you say in your patch description we are not
> hitting this problem in practice, only your test does).

I have a stress test driver which reveals we can easily hit it. In
practice the only way to know if we hit the limit is a:
"request_module: runaway loop modprobe %s" message on dmesg, however
its fatal, how often people inspect a kernel log to see if that came
up though... not sure. So a module could not be loaded and we may not
realize it.

> There is
> no slippery slope for systems to move away, no need to backport
> anything. We seem to agree that a better solution is possible (throttle
> number of concurrently running modprobes without killing requesters),
> and with that solution the band-aid will no longer be needed.
>
> So please implement and post the proper fix for the issue.

Alright, will do away with this patch and just go for the jugular of the issue.

 Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html