On Mon, Dec 14, 2009 at 12:30, Jon Masters <jonathan@xxxxxxxxxxxxxx> wrote: > On Mon, 2009-12-14 at 11:24 +0100, Kay Sievers wrote: >> On Mon, Dec 14, 2009 at 02:12, Jon Masters <jonathan@xxxxxxxxxxxxxx> wrote: >> > On Sat, 2009-12-12 at 04:29 -0500, Jon Masters wrote: >> > >> >> FYI I am diagnosing an unlikely locking issue on a RHEL system right now >> >> that triggers when you have a read-only filesystem and can't use file >> >> locks. I know we ripped out a lot of that code since in upstream but I >> >> might need to address this differently. I'll followup with my findings. >> > >> > I have written a sysV shared memory locking mechanism that using an shm >> > segment in the case that the filesystem is mounted read-only. I'll >> > finish the work on the RHEL bug and then forward port it, and post. We >> > can probably re-introduce this because it works in a read-only context >> > and can't be abused by a user to prevent module loading because the IPC >> > shm segment is user-specific. Only downside is (optional, I guess) >> > dependency on sysV shm. But I don't think we honestly have anyone >> > wanting to use m-i-t on a system without such things compiled in. >> >> Why would we need anything like that? Shouldn't the load syscall >> serialize all that properly? > > It doesn't quite though (in the 2.6.18 kernel I'm focusing on right now, > upstream has some slight changes in the module loader code since then). > We only run under stop machine during the actual link and unlink, I don't think we ever run under stop_machine, only when an error occurs, we do use it, otherwise we never actual stopping and just clean up the prepared stop machine thread. It's all protected only by a mutex, I think. > so > there's a possibility for a second modprobe to come along while another > is running. The existing code upstream was ripped out because there is a > possibility for the flocking to be abused by a user opening the module > file and because it's not normally a big deal if a second modprobe comes > along - worst case the module is already loaded by another instance and > we will fail harmlessly to load it a second time. Totally no problem. > > But you know how these Enterprise distros are ;) there's a harmless > warning generated in the system kernel log in case that you have > multiple modprobes running at the same time trying to load a module with > deps that will fail because the hardware isn't available, and you're on > a root filesystem without the possibility of the previous locking code > (still in there) working(!) :P The old code even accounts for this by > spinning in the procfs reading code, looking for the "Loading" and > "Unloading" state change. The warning is harmless, but anyway. Yeah, maybe this condition should be detected and the warning should be fixed then. :) > We probably don't care in upstream as a vary contrived case that causes > no harm isn't enough to justify that locking be re-introduced. But I'm > sharing mostly for your interest at this point. Sounds good. I still think that this kind of locking/serialization belongs into the kernel only. If we need locking in modprobe to make modprobe reliable, we probably do something wrong. >> The only thing that should happen is needless work, that the failing >> syscall tells us to discard -- which is probably more reliable > > Right. Except in the case that you are loading a useless module (with a > dep that will fail) in the first place. In that case there seems to > still be a rare race (again, this is older 3.3 on an old distro, and > both the kernel and upstream have changed so I am going to get back to > seeing how we're affected now) in which we'll manage to convince > ourselves that one of the deps is loaded (it loads, but fails during > init because the hardware isn't available, though we'll run the > scheduler in between after we link it in before calling init, so another > modprobe might run) long enough to attempt to insert the module itself > and then fail to find one of the dependent symbols. > > Rusty did some fixes to symbol resolution I'm also looking at and this > might turn out to be fixed there. Anyway, it's all harmless in what > happens but I thought I'd mention I'm thinking about locking and whether > we should revive something for contrived issues. Fine. I just want to be careful for the current version to add something back that, if needed, likely just works around the real issue that should be fixed instead. Thanks, Kay -- To unsubscribe from this list: send the line "unsubscribe linux-modules" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html