Re: [REVIEW][PATCH 3/3] vfs: Fix a regression in mounting proc

Quoting Andy Lutomirski (luto@xxxxxxxxxxxxxx):
> On Wed, Nov 27, 2013 at 12:07 PM, Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
> > ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:
> >
> >> Oleg Nesterov <oleg@xxxxxxxxxx> writes:
> >>
> >>> Just to avoid the possible confusion, let me repeat that the fix itself
> >>> looks "obviously fine" to me, "i_nlink != 2" looks obviously wrong.
> >>>
> >>> I am not arguing with this patch, I am just trying to understand this
> >>> logic.
> >>>
> >>> On 11/27, Eric W. Biederman wrote:
> >>>>
> >>>> [... snip ...]
> >>>
> >>> Thanks a lot.
> >>>
> >>>> For the real concern about jail environments where proc and sysfs are
> >>>> not mounted at all a fs_visible check is all that is really required,
> >>>
> >>> this is what I can't understand...
> >>>
> >>> Let's ignore the implementation details. Suppose that proc was never
> >>> mounted. Then "mount -t proc" should fail after CLONE_NEWUSER | NEWNS?
> >>
> >> Yes.
> >
> > Well strictly speaking it should fail after CLONE_NEWUSER | NEWNS | NEWPID.
> > If proc was never mounted.
> >
> > Fresh mounts of proc are not allowed unless you have also created the
> > pid namespace.  With just CLONE_NEWUSER | NEWNS you are limited to bind
> > mounts.
> >
> > Has this cleared up the confusion?
> >
> > Eric
> >
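
For anyone following along, the rule as stated above can be written down
as a small decision function. This is only a toy userspace model under my
reading of the thread, not the kernel's actual check: proc_mount_allowed(),
enum proc_mount_kind and the proc_visible flag are names invented for
illustration, and the caller is assumed to be unprivileged inside a freshly
created user namespace.

```c
#include <linux/sched.h>   /* CLONE_NEWNS, CLONE_NEWUSER, CLONE_NEWPID */
#include <stdbool.h>

/* Hypothetical: the two ways a task might try to get a proc mount. */
enum proc_mount_kind {
    PROC_MOUNT_BIND,   /* bind mount of an already visible proc */
    PROC_MOUNT_FRESH   /* new instance, i.e. "mount -t proc" */
};

/*
 * Hypothetical model of the rule described in the thread.
 * clone_flags:  namespaces the task unshared (CLONE_NEW* bits).
 * proc_visible: assumption standing in for the fs_visible check, true if
 *               proc is already mounted in the caller's mount namespace.
 */
static bool proc_mount_allowed(unsigned long clone_flags, bool proc_visible,
                               enum proc_mount_kind kind)
{
    /* If proc was never mounted, nothing can be bound and a fresh
     * mount should fail, whatever namespaces were created. */
    if (!proc_visible)
        return false;

    /* With an existing proc mount, a bind mount is always possible. */
    if (kind == PROC_MOUNT_BIND)
        return true;

    /* A fresh mount additionally needs a new pid namespace. */
    return (clone_flags & CLONE_NEWPID) != 0;
}
```

So proc_mount_allowed(CLONE_NEWUSER | CLONE_NEWNS, true, PROC_MOUNT_FRESH)
is false in this model, while the same call with PROC_MOUNT_BIND is true.
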
> 
> This is all obnoxiously complicated.  I wonder if we can do (a lot)
> better by allowing a "pid-only" variant of proc to be mounted.  It
> should contain:
> 
>  - All the pid directories
>  - /proc/self, /proc/net, and /proc/mounts (but possibly not
> /proc/PID/net -- that's a weird interface IMO and isn't really related
> to the pid)
>  - keys key-users (wtf is up with that interface, though -- those
> files are way too magical)
>  - cpuinfo, version, and maybe other informational things (crypto?)
>  - loadavg, perhaps
> 
> I wonder if it would be possible to boot a reasonable container with a
> heavily limited /proc like that.

Should be possible.  And heck, maybe some of the values could then
be virtualized :)  cmdline could point to the container init's
cmdline;  cpuinfo and loadavg and meminfo be filtered through
cgroupfs.
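
To make the proposed whitelist concrete, here is a rough userspace sketch
of the filter such a "pid-only" proc might apply to its top-level entries.
Nothing like this exists in the kernel as far as this thread goes:
pid_only_proc_shows() and is_pid_dir() are hypothetical names, the table
just transcribes the list quoted above, and the entries marked "maybe" or
"perhaps" there are judgment calls (loadavg is included, crypto is not).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Top-level names from the proposal; purely illustrative. */
static const char *const pid_only_entries[] = {
    "self", "net", "mounts",
    "keys", "key-users",
    "cpuinfo", "version",
    "loadavg",              /* "perhaps" in the proposal */
};

/* A pid directory is a non-empty name made only of decimal digits. */
static bool is_pid_dir(const char *name)
{
    if (*name == '\0')
        return false;
    for (; *name != '\0'; name++)
        if (!isdigit((unsigned char)*name))
            return false;
    return true;
}

/* Hypothetical filter: would the limited proc expose this entry? */
static bool pid_only_proc_shows(const char *name)
{
    size_t i;

    if (is_pid_dir(name))
        return true;
    for (i = 0; i < sizeof(pid_only_entries) / sizeof(pid_only_entries[0]); i++)
        if (strcmp(name, pid_only_entries[i]) == 0)
            return true;
    return false;
}
```

With this table, "1" and "self" would be shown, while "kcore" or "sys"
would not. Virtualized values would be a separate step on top of the
filter.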

-serge