On Fri, 18 Sep 2009, Eric Paris wrote:

> > Isolating udevd down to an interactivity scheduling change isn't _that_
> > bizarre. I think the setting of UDEVD_PRIORITY is already mostly
> > arbitrary anyway and it'll allow 192 children on your 512M machine by
> > default unless you changed UDEVD_MAX_CHILDS for uid 0.
> >
> > The default timeout for idle workers is 3 seconds, which may just happen
> > to be long enough to panic your machine because of low memory. If that's
> > the case, I don't believe that it's a scheduler issue but rather a root
> > abuse of setting all udevd threads to be OOM_DISABLE.
> >
> > What is your udevd --version? The latest is udev-146 released last month.
>
> 145
>
> Let me try and clone the vm some I don't break my reproducer. I'll see
> if adding more memory fixes it. Doesn't look like Fedora has built a
> -146 yet, I'll see if I can get one of those as well.
>
> udev bug, configuration issue, whatever, or not, it's a regression that
> I used to be able to boot and updating my kernel leaves me unable to
> boot. I think we all agree when 512M of memory isn't enough to boot to
> runlevel 3 we've got a problem :)
>

I totally agree, and my hypothesis is that the idle child workers are not
being killed in time, so they quickly accumulate and approach
UDEVD_MAX_CHILDS. When the oom killer is then called because of a write to
shared memory, it can't kill any of these threads since udevd sets them
all to OOM_DISABLE, and everything else is an unkillable kthread.

Bisecting that to a scheduler change would suggest that each udevd thread
isn't returning from its poll() timeout fast enough. There's essentially a
street race between udevd killing its own threads off because the poll
timeout was exceeded, and all your memory being used up and the machine
panicking. The scheduling change seems to have affected the speed of the
former.
The maximum number of children defaults to 192 on your 512M machine unless
it is overridden by the UDEVD_MAX_CHILDS environment variable, so you may
find it helpful to reduce this to a saner value. I'd suggest a value lower
than the number of udevd threads that were shown in your latest oom killer
dump. If that turns out to fix the issue for you, perhaps max_childs needs
to be calculated in a slightly more conservative way in the userspace
package, since all threads come with the prerequisite of being OOM_DISABLE.
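As a rough illustration of how such a default could be derived and overridden, here is a sketch under stated assumptions. The 128 + MB/8 formula is an assumption chosen because it reproduces the 192 figure for 512M; nothing in this thread confirms it. The function names default_max_childs() and effective_max_childs() are hypothetical.

```c
#include <stdlib.h>

/*
 * Memory-scaled default. ASSUMPTION: the 128 + MB/8 formula is used
 * only because it yields 192 for a 512M machine; it is not taken from
 * this thread.
 */
static int default_max_childs(int memsize_mb)
{
	if (memsize_mb > 0)
		return 128 + memsize_mb / 8;
	return 128;	/* fallback when the memory size is unknown */
}

/* The UDEVD_MAX_CHILDS environment variable wins if it is set. */
static int effective_max_childs(int memsize_mb)
{
	const char *value = getenv("UDEVD_MAX_CHILDS");

	if (value != NULL && *value != '\0')
		return (int)strtoul(value, NULL, 10);
	return default_max_childs(memsize_mb);
}
```

With the variable unset, effective_max_childs(512) gives 192 under this assumed formula. A more conservative calculation would only need to change default_max_childs().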