Re: [RFC] [PATCH v2 1/6] add configure option --with-fuse for libvirt

"Daniel P. Berrange" <berrange@xxxxxxxxxx> · Wed, 5 Sep 2012 13:42:40 +0100

On Wed, Sep 05, 2012 at 05:41:40PM +0800, Gao feng wrote:
> Hi Daniel & Glauber
> 
> On 2012-07-31 17:27, Daniel P. Berrange wrote:
> > Hi Gao,
> > 
> > I'm wondering if you are planning to attend the Linux Plumbers Conference
> > in San Diego at the end of August ?  Glauber is going to be giving a talk
> > on precisely the subject of virtualizing /proc in containers which is
> > exactly what your patch is looking at
> > 
> >   https://blueprints.launchpad.net/lpc/+spec/lpc2012-cont-proc
> > 
> > I'll review your patches now, but I think I'd like to wait to hear what
> > Glauber talks about at LPC before we try to merge this support in libvirt,
> > so we have a broadly agreed long term strategy for /proc between all the
> > interested userspace & kernel guys.
> 
> I did not attend LPC, so can you tell me what the current situation is
> with /proc virtualization?
> 
> I think maybe we should just apply this patchset first, and wait for somebody
> to send patches that implement /proc virtualization.

So there were three main approaches discussed:

 1. FUSE based /proc + a real hidden /.proc. The FUSE /proc provides custom
    handling of various files like meminfo, otherwise forwards I/O requests
    through to the hidden /.proc files. This was the original proof of
    concept.

 2. One FUSE filesystem for all containers + a real /proc. Bind mount files
    from the FUSE filesystem into the container's /proc. This is what Glauber
    has done.

 3. One FUSE filesystem per container + a real /proc. Bind mount files from
    the FUSE filesystem into the container's /proc. This is what your patch
    is doing.
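
To make the cost of option 1 concrete, here is a minimal Python sketch of the
dispatch it implies: every read goes through the FUSE layer, which either
answers from a custom handler or forwards to the hidden /.proc. The function
name, the handler table, and the temporary directory standing in for /.proc
are all hypothetical; this is not the proof-of-concept code and it does not
use a real FUSE binding.

```python
import os
import tempfile


def read_virtual_proc(name, handlers, hidden_root):
    """Return the content of a /proc entry as option 1 would serve it.

    `handlers` maps file names to callables producing custom content.
    Anything without a handler is forwarded to the real file under
    `hidden_root` (the stand-in for the hidden /.proc). The sketch does
    no path sanitising; `name` is assumed to be a plain file name.
    """
    handler = handlers.get(name)
    if handler is not None:
        return handler()
    # Forwarded read: one extra hop for every file that is not special.
    with open(os.path.join(hidden_root, name)) as f:
        return f.read()


# Demo: a temporary directory plays the role of the hidden /.proc.
hidden = tempfile.mkdtemp()
with open(os.path.join(hidden, "uptime"), "w") as f:
    f.write("100.00 50.00\n")

# Hypothetical handler table: only meminfo gets custom content.
handlers = {"meminfo": lambda: "MemTotal:        1048576 kB\n"}

custom = read_virtual_proc("meminfo", handlers, hidden)
forwarded = read_virtual_proc("uptime", handlers, hidden)
```

The forwarded branch is the part options 2 and 3 avoid, because unmodified
files are read straight from the real /proc.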

Options 2 & 3 have a clear win over option 1 in efficiency terms, since
they avoid doubling the I/O required for the majority of files.
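
As an illustration of the kind of file that does still need custom content,
the sketch below rewrites the MemTotal and MemFree lines of a meminfo-style
text and passes every other line through untouched. It assumes the limit and
usage figures come from somewhere like the container's memory cgroup; that
source, the function name, and the exact column widths are assumptions of
this example, not a description of the patch.

```python
def virtualize_meminfo(host_meminfo, limit_kb, usage_kb):
    """Return meminfo text with MemTotal/MemFree replaced by container values.

    `limit_kb` and `usage_kb` are assumed inputs (for example, read from the
    container's memory cgroup). All other lines are copied unchanged.
    """
    overrides = {
        "MemTotal": limit_kb,
        "MemFree": max(limit_kb - usage_kb, 0),
    }
    out = []
    for line in host_meminfo.splitlines():
        key, sep, _ = line.partition(":")
        if sep and key in overrides:
            # Approximate the "Key:   value kB" layout of /proc/meminfo.
            out.append(f"{key + ':':<16}{overrides[key]:>8} kB")
        else:
            out.append(line)
    return "\n".join(out) + "\n"


host = (
    "MemTotal:       16384256 kB\n"
    "MemFree:         8000000 kB\n"
    "Buffers:          123456 kB\n"
)
container_view = virtualize_meminfo(host, limit_kb=1048576, usage_kb=262144)
```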

Glauber thinks it is preferable to have a single FUSE filesystem that
has one sub-directory for each container. Then bind mount the appropriate
sub dir into each container.

I kinda like the way you have done things, having a private FUSE filesystem
per container, for security reasons. By having the FUSE backend be part of
the libvirt_lxc process we have strictly isolated each container's environment.
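
The two layouts differ mainly in which process serves the files that end up
bind mounted into a container's /proc. A rough Python sketch of the resulting
mount plan follows; the mount root, the directory names, and the
"shared-daemon" label are invented for illustration, and the function only
computes paths, it does not mount anything.

```python
import posixpath


def bind_mount_plan(layout, container, files, fuse_root="/run/example-fuse"):
    """List (served_by, source, target) triples for one container.

    layout == "shared":  one FUSE filesystem, one sub-directory per container.
    layout == "private": one FUSE filesystem per container, served by that
                         container's own libvirt_lxc process.
    All paths are hypothetical placeholders.
    """
    if layout == "shared":
        served_by = "shared-daemon"
        source_dir = posixpath.join(fuse_root, "shared", container)
    elif layout == "private":
        served_by = "libvirt_lxc[" + container + "]"
        source_dir = posixpath.join(fuse_root, "private-" + container)
    else:
        raise ValueError("unknown layout: " + layout)
    return [
        (served_by, posixpath.join(source_dir, name), posixpath.join("/proc", name))
        for name in files
    ]


shared_plan = bind_mount_plan("shared", "c1", ["meminfo"])
private_plan = bind_mount_plan("private", "c1", ["meminfo"])
```

In the shared plan every container's files come from the same serving
process; in the private plan a fault in one serving process only touches the
container it belongs to.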

If we wanted a single shared FUSE for all containers, we'd need to have some
single shared daemon to maintain it. This could not be libvirtd itself, since
we need the containers & their filesystems to continue to work when libvirtd
itself is not running. We could introduce a separate libvirt_fused that
provided a shared filesystem, but this still has the downside that any
flaw in its implementation could provide a way for one container to attack
another container.

So in summary, I think your patches which add a private FUSE per container
in libvirt_lxc appear to be the best option at this time.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
