Re: module locking

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 14, 2009 at 12:30, Jon Masters <jonathan@xxxxxxxxxxxxxx> wrote:
> On Mon, 2009-12-14 at 11:24 +0100, Kay Sievers wrote:
>> On Mon, Dec 14, 2009 at 02:12, Jon Masters <jonathan@xxxxxxxxxxxxxx> wrote:
>> > On Sat, 2009-12-12 at 04:29 -0500, Jon Masters wrote:
>> >
>> >> FYI I am diagnosing an unlikely locking issue on a RHEL system right now
>> >> that triggers when you have a read-only filesystem and can't use file
>> >> locks. I know we ripped out a lot of that code since in upstream but I
>> >> might need to address this differently. I'll followup with my findings.
>> >
>> > I have written a sysV shared memory locking mechanism that using an shm
>> > segment in the case that the filesystem is mounted read-only. I'll
>> > finish the work on the RHEL bug and then forward port it, and post. We
>> > can probably re-introduce this because it works in a read-only context
>> > and can't be abused by a user to prevent module loading because the IPC
>> > shm segment is user-specific. Only downside is (optional, I guess)
>> > dependency on sysV shm. But I don't think we honestly have anyone
>> > wanting to use m-i-t on a system without such things compiled in.
>>
>> Why would we need anything like that? Shouldn't the load syscall
>> serialize all that properly?
>
> It doesn't quite though (in the 2.6.18 kernel I'm focusing on right now,
> upstream has some slight changes in the module loader code since then).
> We only run under stop machine during the actual link and unlink,

I don't think we ever run under stop_machine, only when an error
occurs, we do use it, otherwise we never actual stopping and just
clean up the prepared stop machine thread. It's all protected only by
a mutex, I think.

> so
> there's a possibility for a second modprobe to come along while another
> is running. The existing code upstream was ripped out because there is a
> possibility for the flocking to be abused by a user opening the module
> file and because it's not normally a big deal if a second modprobe comes
> along - worst case the module is already loaded by another instance and
> we will fail harmlessly to load it a second time. Totally no problem.
>
> But you know how these Enterprise distros are ;) there's a harmless
> warning generated in the system kernel log in case that you have
> multiple modprobes running at the same time trying to load a module with
> deps that will fail because the hardware isn't available, and you're on
> a root filesystem without the possibility of the previous locking code
> (still in there) working(!) :P The old code even accounts for this by
> spinning in the procfs reading code, looking for the "Loading" and
> "Unloading" state change. The warning is harmless, but anyway.

Yeah, maybe this condition should be detected and the warning should
be fixed then. :)

> We probably don't care in upstream as a vary contrived case that causes
> no harm isn't enough to justify that locking be re-introduced. But I'm
> sharing mostly for your interest at this point.

Sounds good. I still think that this kind of locking/serialization
belongs into the kernel only. If we need locking in modprobe to make
modprobe reliable, we probably do something wrong.

>> The only thing that should happen is needless work, that the failing
>> syscall tells us to discard -- which is probably more reliable
>
> Right. Except in the case that you are loading a useless module (with a
> dep that will fail) in the first place. In that case there seems to
> still be a rare race (again, this is older 3.3 on an old distro, and
> both the kernel and upstream have changed so I am going to get back to
> seeing how we're affected now) in which we'll manage to convince
> ourselves that one of the deps is loaded (it loads, but fails during
> init because the hardware isn't available, though we'll run the
> scheduler in between after we link it in before calling init, so another
> modprobe might run) long enough to attempt to insert the module itself
> and then fail to find one of the dependent symbols.
>
> Rusty did some fixes to symbol resolution I'm also looking at and this
> might turn out to be fixed there. Anyway, it's all harmless in what
> happens but I thought I'd mention I'm thinking about locking and whether
> we should revive something for contrived issues.

Fine. I just want to be careful for the current version to add
something back that, if needed, likely just works around the real
issue that should be fixed instead.

Thanks,
Kay
--
To unsubscribe from this list: send the line "unsubscribe linux-modules" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux