On Mon, 2009-12-14 at 11:24 +0100, Kay Sievers wrote: > On Mon, Dec 14, 2009 at 02:12, Jon Masters <jonathan@xxxxxxxxxxxxxx> wrote: > > On Sat, 2009-12-12 at 04:29 -0500, Jon Masters wrote: > > > >> FYI I am diagnosing an unlikely locking issue on a RHEL system right now > >> that triggers when you have a read-only filesystem and can't use file > >> locks. I know we ripped out a lot of that code since in upstream but I > >> might need to address this differently. I'll followup with my findings. > > > > I have written a sysV shared memory locking mechanism that using an shm > > segment in the case that the filesystem is mounted read-only. I'll > > finish the work on the RHEL bug and then forward port it, and post. We > > can probably re-introduce this because it works in a read-only context > > and can't be abused by a user to prevent module loading because the IPC > > shm segment is user-specific. Only downside is (optional, I guess) > > dependency on sysV shm. But I don't think we honestly have anyone > > wanting to use m-i-t on a system without such things compiled in. > > Why would we need anything like that? Shouldn't the load syscall > serialize all that properly? It doesn't quite though (in the 2.6.18 kernel I'm focusing on right now, upstream has some slight changes in the module loader code since then). We only run under stop machine during the actual link and unlink, so there's a possibility for a second modprobe to come along while another is running. The existing code upstream was ripped out because there is a possibility for the flocking to be abused by a user opening the module file and because it's not normally a big deal if a second modprobe comes along - worst case the module is already loaded by another instance and we will fail harmlessly to load it a second time. Totally no problem. But you know how these Enterprise distros are ;) there's a harmless warning generated in the system kernel log in case that you have multiple modprobes running at the same time trying to load a module with deps that will fail because the hardware isn't available, and you're on a root filesystem without the possibility of the previous locking code (still in there) working(!) :P The old code even accounts for this by spinning in the procfs reading code, looking for the "Loading" and "Unloading" state change. The warning is harmless, but anyway. We probably don't care in upstream as a vary contrived case that causes no harm isn't enough to justify that locking be re-introduced. But I'm sharing mostly for your interest at this point. > The only thing that should happen is needless work, that the failing > syscall tells us to discard -- which is probably more reliable Right. Except in the case that you are loading a useless module (with a dep that will fail) in the first place. In that case there seems to still be a rare race (again, this is older 3.3 on an old distro, and both the kernel and upstream have changed so I am going to get back to seeing how we're affected now) in which we'll manage to convince ourselves that one of the deps is loaded (it loads, but fails during init because the hardware isn't available, though we'll run the scheduler in between after we link it in before calling init, so another modprobe might run) long enough to attempt to insert the module itself and then fail to find one of the dependent symbols. Rusty did some fixes to symbol resolution I'm also looking at and this might turn out to be fixed there. Anyway, it's all harmless in what happens but I thought I'd mention I'm thinking about locking and whether we should revive something for contrived issues. Jon. -- To unsubscribe from this list: send the line "unsubscribe linux-modules" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html