> -----Original Message-----
> From: Daniel P. Berrange [mailto:berrange@xxxxxxxxxx]
> Sent: Thursday, May 12, 2016 5:28 PM
> To: Mooney, Sean K <sean.k.mooney@xxxxxxxxx>
> Cc: libvir-list@xxxxxxxxxx
> Subject: Re: adding a new libvirt xml element for File
> Descriptor backed memory for use with vhost-user
>
> On Thu, May 12, 2016 at 04:00:29PM +0000, Mooney, Sean K wrote:
> > > > Today it is possible to use Libvirt to spawn a VM without hugepage
> > > > memory and with a file descriptor backed memdev via the use of the
> > > > qemu:commandline element.
> > > >
> > > > <qemu:commandline>
> > > >   <qemu:arg value='-object'/>
> > > >   <qemu:arg value='memory-backend-file,id=mem,size=1024M,mem-path=/var/lib/libvirt/qemu,share=on'/>
> > > >   <qemu:arg value='-numa'/>
> > > >   <qemu:arg value='node,memdev=mem'/>
> > > >   <qemu:arg value='-mem-prealloc'/>
> > > > </qemu:commandline>
> > > >
> > > > I created a proof of concept patch to nova to demonstrate that
> > > > this works, however to support this use case in Nova a new xml
> > > > element is required.
> > > > https://review.openstack.org/#/c/309565/1
> > > >
> > > > I would like to propose the introduction of a new subelement to
> > > > the memoryBacking element to request file descriptor backed memory
> > > >
> > > > <memoryBacking>
> > > >   <filedescriptor size_mb="1024" path="/var/lib/libvirt/qemu"
> > > >     prealloc="true" shared="on" />
> > > > </memoryBacking>
> > >
> > > Specifying a size is not required - we already know how big memory
> > > must be for the guest.
> > >
> > > We already have a memAccess='shared' attribute against the <numa>
> > > element that is used to determine if the underlying memory should be
> > > setup as shared. We could define a further element that lets us
> > > control memory access mode for guests without NUMA topology specified.
> >
> > [Mooney, Sean K] Hi, yes, the reason I added the shared attribute was to
> > cater for the case of a guest without NUMA topology.
> > For guests with NUMA topology I agree that using memAccess='shared'
> > on the cell is better for consistency with hugepage memory.
> >
> > > <memoryBacking>
> > >   <access mode="shared"/>
> > > </memoryBacking>
> > >
> > > For huge pages it seems we unconditionally pass --mem-prealloc. I'm
> > > thinking we could perhaps make that configurable via an element
> > >
> > > <memoryBacking>
> > >   <allocation mode="immediate|ondemand"/>
> > > </memoryBacking>
> > >
> > > to control use of -mem-prealloc or not.
> >
> > [Mooney, Sean K] For the vhost-user case -mem-prealloc is
> > required because you are basically doing DMA, so you really want the
> > memory to be allocated.
> > Generically though, from a Libvirt point of view I do think it makes
> > sense for this to be configurable, to allow oversubscription of memory
> > for higher density.
> >
> > > So all that remains is a way to request file based backing of RAM.
> > > As with huge pages, I think we should hide the actual path from the user.
> > > We should just use /dev/shm as the backing for non-hugepage RAM. For
> > > this we could define something like
> > >
> > > <memoryBacking>
> > >   <source type="file|anonymous"/>
> > > </memoryBacking>
> >
> > [Mooney, Sean K] For some reason when I used /dev/shm I could only
> > boot one instance at a time.
> > That was my first choice, but maybe we would have to create a file per
> > instance under /dev/shm to make it work.
>
> QEMU should create the file itself - it's no different to our use of
> hugetlbfs in fact. Possibly you hit a limit on the amount of memory allowed
> to be used via /dev/shm - iirc the mount point is limited to 50% by
> default
>
> If you use /var/lib/libvirt/ as the location you get a real file backed
> by disk, so akin to putting the VM on swap IIUC !

[Mooney, Sean K] That was my initial assumption too, however when you use
/var/lib/libvirt/ or /dev/shm, qemu does not create a file in the directory.
What I think is happening is that it does not actually create a file, just a
file descriptor that is mapped to a memory region. I believe it is merely
using the path to determine what the default page size should be when
allocating the file backing in memory. This is something that we can look
into though.

> > > Putting that all together, to get what you want we'd have
> > >
> > > <memoryBacking>
> > >   <source type="file"/>
> > >   <access mode="shared"/>
> > >   <allocation mode="immediate"/>
> > > </memoryBacking>
> >
> > [Mooney, Sean K]
> > Yes, this seems like it would be a clean way to address this use case.
> > Can you gauge how small/large a change this would be? It's been a
> > while since I worked with C directly, but if you could point me in the
> > right direction in the Libvirt codebase I would be happy to look at
> > creating an RFC patch.
>
> First there's defining the XML extensions - needs
> docs/schemas/domaincommon.rng and src/conf/domain_conf.{c,h} to be
> changed.
>
> Then there's wiring up QEMU XML -> ARGV conversion -
> src/qemu/qemu_command.c and adding test cases in
> tests/qemuxml2argvtest.c
>
> > From the nova side, assuming Libvirt was extended for this feature, should
> > I open a blueprint to extend the existing guest memory backing support
> > in parallel to the Libvirt implementation, or wait until after it is
> > supported in Libvirt to start the Nova discussion? In either case I
> > think we agree that any support in nova would depend on Libvirt
> > support to be accepted in upstream nova.
>
> You're going to hit the deadline for approval of Newton specs in Nova
> fairly soon, and unless the libvirt impl is done before then, I think it
> is unlikely you'd get a spec approved. So by all means work on this in
> parallel, but be realistic about chances of approval in Nova for this
> cycle.

[Mooney, Sean K] Actually I was assuming that this would be completed early
in Ocata, as it required changes in Libvirt first.
>
> Regards,
> Daniel
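
For anyone following the thread, the XML -> ARGV mapping discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not libvirt's implementation (which, as noted above, would belong in src/qemu/qemu_command.c). The function name memory_backing_to_qemu_args, the size_mb and mem_path parameters, and the fixed id=mem are hypothetical and chosen for the example. It turns the proposed <memoryBacking> sub-elements into the QEMU arguments from the qemu:commandline snippet in the original message.

```python
import xml.etree.ElementTree as ET


def memory_backing_to_qemu_args(xml_text, size_mb, mem_path="/dev/shm"):
    """Translate a proposed <memoryBacking> element into QEMU arguments.

    Hypothetical helper for illustration only: size_mb stands in for the
    guest memory size libvirt already knows, and mem_path for the backing
    location libvirt would choose internally.
    """
    root = ET.fromstring(xml_text)
    source = root.find("source")
    access = root.find("access")
    allocation = root.find("allocation")

    args = []
    # <source type="file"/> requests a file-backed memory object.
    if source is not None and source.get("type") == "file":
        # <access mode="shared"/> corresponds to share=on on the backend.
        shared = access is not None and access.get("mode") == "shared"
        share = "on" if shared else "off"
        backend = (
            f"memory-backend-file,id=mem,size={size_mb}M,"
            f"mem-path={mem_path},share={share}"
        )
        args += ["-object", backend, "-numa", "node,memdev=mem"]
    # <allocation mode="immediate"/> corresponds to -mem-prealloc.
    if allocation is not None and allocation.get("mode") == "immediate":
        args.append("-mem-prealloc")
    return args


if __name__ == "__main__":
    proposed = """
    <memoryBacking>
      <source type="file"/>
      <access mode="shared"/>
      <allocation mode="immediate"/>
    </memoryBacking>
    """
    print(memory_backing_to_qemu_args(proposed, 1024, "/var/lib/libvirt/qemu"))
```

The path is a parameter here rather than an XML attribute, in line with the suggestion in the thread that libvirt should hide the actual path from the user.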