On Wed, Sep 06, 2017 at 01:35:45PM +0200, Michal Privoznik wrote:
> On 09/05/2017 04:07 PM, Daniel P. Berrange wrote:
> > On Tue, Sep 05, 2017 at 03:59:09PM +0200, Michal Privoznik wrote:
> >> On 07/28/2017 10:59 AM, Daniel P. Berrange wrote:
> >>> On Fri, Jul 28, 2017 at 10:45:21AM +0200, Michal Privoznik wrote:
> >>>> On 07/27/2017 03:50 PM, Daniel P. Berrange wrote:
> >>>>> On Thu, Jul 27, 2017 at 02:11:25PM +0200, Michal Privoznik wrote:
> >>>>>> Dear list,
> >>>>>>
> >>>>>> there is the following bug [1] which I'm not quite sure how to grasp. So
> >>>>>> there is this application/infrastructure called Kove [2] that allows you
> >>>>>> to have memory for your application stored on a distant host in network
> >>>>>> and basically fetch needed region on pagefault. Now imagine that
> >>>>>> somebody wants to use it for backing up domain memory. However, the way
> >>>>>> that the tool works is it has some kernel module and then some userland
> >>>>>> binary that is fed with the path of the mmaped file. I don't know all
> >>>>>> the details, but the point is, in order to let users use this we need to
> >>>>>> expose the paths for mem-path for the guest memory. I know we did not
> >>>>>> want to do this in the past, but now it looks like we don't have a way
> >>>>>> around it, do we?
> >>>>>
> >>>>> We don't want to expose the concept of paths in the XML because this is
> >>>>> a Linux-specific way to configure hugepages / shared memory. So we hide
> >>>>> the particular path used in the internal impl of the QEMU driver, and/or
> >>>>> via the qemu.conf global config file. I don't really want to change
> >>>>> that approach, particularly if the only reason is to integrate with a
> >>>>> closed source binary like Kove.
> >>>>
> >>>> Yep, I agree with that. However, if you read the discussion in the
> >>>> linked bug you'll find that they need to know what file in the
> >>>> memory_backing_dir (from qemu.conf) corresponds to which domain. The
> >>>> reporter suggested using UUID based filenames, which I fear is not
> >>>> enough because one can have multiple <memory type='dimm'/> -s configured
> >>>> for their domain. But I guess we could go with:
> >>>>
> >>>> ${memory_backing_dir}/${domName} for generic memory
> >>>> ${memory_backing_dir}/${domName}_N for Nth <memory/>
> >>>
> >>> This feels like it is going to lead to hell when you add in memory
> >>> hotplug/unplug, with inevitable races.
> >>>
> >>>> BTW: IIUC they want predictable names because they need to create the
> >>>> files before spawning qemu so that they are picked by qemu instead of
> >>>> using temporary names.
> >>>
> >>> I would like to know why they even need to associate particular memory
> >>> files with particular QEMU processes. eg if they're just exposing a
> >>> new type of tmpfs filesystem from the kernel why does it matter what
> >>> each file is used for.
> >>
> >> This might get you an answer:
> >>
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1461214#c4
> >>
> >> So the way I understand it is that they will create the files, and
> >> provide us with paths. So luckily, we don't have to make up the paths on
> >> our own.
> >
> > IOW it is pretending to be tmpfs except it is not behaving like tmpfs.
> > This doesn't really make me any more inclined to support this closed
> > source stuff in libvirt.
>
> Yeah, that's my feeling too. So, what about the following: let's assume
> they will fix their code so that it is proper tmpfs. Libvirt can then
> behave to it just like it is already doing so for hugetlbfs. For us
> it'll be just yet another type of hugepages. I mean, for hugepages we
> already create /hugepages/mount/point/libvirt/$domain per each domain so
> the separation is there (even though this is considered internal impl),
> since it would be a proper tmpfs they can see the pid of qemu which is
> trying to mmap() (and take the name or whatever unique ID they want from
> there).

Yep, we can at least make a reasonable guarantee that all files belonging
to a single QEMU process will always be within the same sub-directory. This
allows the kmod to distinguish 2 files owned by separate VMs, from 2 files
owned by the same VM and do what's needed. I don't see why it would need to
care about naming conventions beyond the layout.

> I guess what I'm trying to ask is if it was proper tmpfs, we would be
> okay with it, wouldn't we?

If it is indistinguishable from tmpfs/hugetlbfs from libvirt's POV, we
should be fine - at most you would need /etc/libvirt/qemu.conf change to
explicitly point at the custom mount point if libvirt doesn't auto-detect
the right one.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list