On 09/06/2012 05:53 AM, Gao feng wrote:
> On 2012-09-05 20:42, Daniel P. Berrange wrote:
>> On Wed, Sep 05, 2012 at 05:41:40PM +0800, Gao feng wrote:
>>> Hi Daniel & Glauber,
>>>
>>> On 2012-07-31 17:27, Daniel P. Berrange wrote:
>>>> Hi Gao,
>>>>
>>>> I'm wondering if you are planning to attend the Linux Plumbers Conference
>>>> in San Diego at the end of August? Glauber is going to be giving a talk
>>>> on precisely the subject of virtualizing /proc in containers, which is
>>>> exactly what your patch is looking at:
>>>>
>>>> https://blueprints.launchpad.net/lpc/+spec/lpc2012-cont-proc
>>>>
>>>> I'll review your patches now, but I think I'd like to wait to hear what
>>>> Glauber talks about at LPC before we try to merge this support into
>>>> libvirt, so we have a broadly agreed long-term strategy for /proc between
>>>> all the interested userspace & kernel guys.
>>>
>>> I did not attend the LPC, so can you tell me what the situation with
>>> /proc virtualization is?
>>>
>>> I think maybe we should just apply this patchset first, and wait for
>>> somebody to send patches to implement /proc virtualization.
>>
>> So there were three main approaches discussed:
>>
>> 1. FUSE-based /proc + a real hidden /.proc. The FUSE /proc provides custom
>>    handling of various files like meminfo, and otherwise forwards I/O
>>    requests through to the hidden /.proc files. This was the original
>>    proof of concept.
>>
>> 2. One FUSE filesystem for all containers + a real /proc. Bind mount files
>>    from the FUSE filesystem into each container's /proc. This is what
>>    Glauber has done.
>>
>> 3. One FUSE filesystem per container + a real /proc. Bind mount files from
>>    the FUSE filesystem into the container's /proc. This is what your patch
>>    is doing.
>>
>> Options 2 & 3 have a clear win over option 1 in efficiency terms, since
>> they avoid doubling the I/O required for the majority of files.
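To make the bind-mount mechanics of options 2 & 3 concrete, here is a rough sketch. All paths and the `container-procfs` daemon name are hypothetical illustrations, not libvirt's actual layout; it needs root and a mount namespace to run:

```shell
# Sketch of options 2 & 3: a FUSE filesystem serves container-specific
# versions of selected files, which are then bind mounted over the
# corresponding entries in the container's real /proc. Paths and the
# 'container-procfs' binary are hypothetical. Requires root.

CT=mycontainer
FUSE_DIR=/run/lxc-fuse/$CT

mkdir -p "$FUSE_DIR"

# A FUSE daemon (one per container for option 3, or a single shared daemon
# with a sub-directory per container for option 2) would serve files such
# as 'meminfo' here, e.g. (hypothetical command):
#   container-procfs "$FUSE_DIR"

# Inside the container's mount namespace: mount the real procfs, then
# shadow only the virtualized files with bind mounts from the FUSE tree.
mount -t proc proc /proc
mount --bind "$FUSE_DIR/meminfo" /proc/meminfo
```

Everything not bind-mounted stays a plain procfs read, which is why this avoids the doubled I/O of option 1.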
>>
>> Glauber thinks it is preferable to have a single FUSE filesystem that
>> has one sub-directory for each container, then bind mount the appropriate
>> sub-directory into each container.
>>
>> I kinda like the way you have done things, having a private FUSE
>> filesystem per container, for security reasons. By having the FUSE
>> backend be part of the libvirt_lxc process we have strictly isolated
>> each container's environment.
>>
>> If we wanted a single shared FUSE for all containers, we'd need to have
>> some single shared daemon to maintain it. This could not be libvirtd
>> itself, since we need the containers & their filesystems to continue to
>> work when libvirtd itself is not running. We could introduce a separate
>> libvirt_fused which provided a shared filesystem, but this still has the
>> downside that any flaw in its implementation could provide a way for one
>> container to attack another container.
>
> Agreed. If we choose option 2, we have to organize the sub-directory for
> each container in FUSE, which will make the FUSE filesystem complicated.

According to Daniel Lezcano, who tried it once, FUSE is very fork intensive,
and having one mount per container would lead to bad performance. But I have
to admit I have never measured it myself. I would be curious to see any
numbers for a large deployment, to see if that complication is worth the
gain.

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list