Re: [PATCH] driver-core: devtmpfs - driver core maintained /dev tmpfs

Alan Jenkins <sourcejedi.lkml@xxxxxxxxxxxxxx> · Sat, 2 May 2009 22:41:50 +0100

On 5/2/09, Alan Jenkins <sourcejedi.lkml@xxxxxxxxxxxxxx> wrote:
> On 5/2/09, Kay Sievers <kay.sievers@xxxxxxxx> wrote:
>> On Sat, May 2, 2009 at 09:16, Christoph Hellwig <hch@xxxxxxxxxxxxx>
>> wrote:
>>>> After the rootfs is mounted by the kernel, the
>>>> populated tmpfs is mounted at /dev. In initramfs, it can be moved
>>>> to the manually mounted root filesystem before /sbin/init is
>>>> executed.
>>>
>>> That for example is something that is not acceptable.  We really don't
>>> want the kernel to mess with the initial namespace in such a major way.
>>
>> There is nothing like "mess around", it's not mounted at all, until
>> the kernel mounts the root filesystem at /, then devtmpfs is mounted
>> the first time, and only if it's compiled in because you asked for it.
>> Also, just try:
>>   egrep 'mknod|create_dev' init/*.c
>> and see what we currently do.
>>
>>> Counter-proposal:  Re-introduce a proper mini-devfs.  All nodes in there
>>> are kernel-created and not changeable which sorts out that whole
>>> mess of both drivers and userspace messing with tree topology we had
>>> both in original devfs and this new devtmpfs.  Single-instance so it
>>> can be populated before it's actually mounted somewhere, that way the
>>> kernel doesn't have to do any policy decision on where it's mounted.
>>
>> That sounds worse than devtmpfs, and does not help for most of the
>> mentioned problems we are trying to solve here.
>
> On a narrow issue: do you really object to moving the "mount dev -t
> devfs2 /dev" into userspace (and therefore giving it a user-visible
> name)??  That would address Christoph's particular objection about
> "messing around" with the initial namespace.  It means I can be 100%
> sure I can boot an old initramfs with this option enabled.  And it
> gives a nice clean way for a new initramfs to test for this feature -
> if it is missing, the mount simply fails.  It would seem to make for a
> rather smoother migration path.
>
> It shouldn't mean too many more LOC, you're already doing the "single
> instance" thing.
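
To make the initramfs side of that concrete, here is a rough sketch,
not a tested implementation.  It assumes the filesystem type really is
called "devfs2" as in my example above; the function name and the
overridable MOUNT variable are made up for illustration.  Note that a
failed mount can have other causes (not root, missing mount point), so
"fallback" only means the kernel-provided /dev was not usable.

```shell
#!/bin/sh
# MOUNT is overridable so the logic can be exercised without root
# (an assumption for illustration only).
MOUNT=${MOUNT:-mount}

# try_kernel_dev FSTYPE TARGET
# Prints "kernel" if the kernel-maintained /dev filesystem could be
# mounted on TARGET, "fallback" if the mount attempt failed.
try_kernel_dev() {
    if $MOUNT -t "$1" dev "$2" 2>/dev/null; then
        echo kernel
    else
        echo fallback
    fi
}
```

An initramfs could call "try_kernel_dev devfs2 /dev" and, on
"fallback", carry on creating its device nodes the way it does today.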

Also, AFAICS this would avoid a memory leak on umount.

In its original form, if you unmount it, you can't get it back.  But
it doesn't get destroyed either; all the tmpfs nodes just hang around
in limbo, right?

It'd be even nicer if it somehow avoided consuming memory when it
isn't used.  I guess that requires more code, looks more like a "real"
devfs, and as Christoph says is probably more sane if exported
read-only.  Hopefully it would make it possible to remove the code
afterwards, but that sounds like even more work.

But is read-only so bad?  You just have to copy it over to a tmpfs and
then mount that on top of /dev.  That's atomic, so it won't interfere
with parallel early init.  I sympathize, devtmpfs is a really neat
hack that does exactly what udev needs.  But you have to admit, it
doesn't fit in _quite_ as well with the kernel status quo.
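
Roughly, the copy-and-overlay step could look like the sketch below.
This is only an illustration under assumptions: the staging directory,
the function names and the DRY_RUN switch are hypothetical, and I'm
assuming "mount --move" is used for the final step so that the
populated tmpfs appears on /dev in one operation.  It needs root when
run for real, and "mount --move" refuses to work if the parent mount
is shared.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# overlay_dev SRC STAGING TARGET
# Copy the (read-only) device tree at SRC into a fresh tmpfs mounted
# at STAGING, then move that tmpfs on top of TARGET.
overlay_dev() {
    src=$1; staging=$2; target=$3
    run mkdir -p "$staging" &&
    run mount -t tmpfs -o mode=0755 tmpfs "$staging" &&
    run cp -a "$src/." "$staging/" &&      # -a keeps device nodes and modes
    run mount --move "$staging" "$target"  # the single switch-over step
}
```

With DRY_RUN=1, "overlay_dev /dev /tmp/newdev /dev" just prints the
four commands, which is a safe way to check the sequence.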

Thanks
Alan