On Wed, Jun 17, 2015 at 12:37:42PM +0200, Igor Mammedov wrote: > On Wed, 17 Jun 2015 12:11:09 +0200 > "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote: > > > On Wed, Jun 17, 2015 at 10:54:21AM +0200, Igor Mammedov wrote: > > > On Wed, 17 Jun 2015 09:39:06 +0200 > > > "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote: > > > > > > > On Wed, Jun 17, 2015 at 09:28:02AM +0200, Igor Mammedov wrote: > > > > > On Wed, 17 Jun 2015 08:34:26 +0200 > > > > > "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote: > > > > > > > > > > > On Wed, Jun 17, 2015 at 12:00:56AM +0200, Igor Mammedov wrote: > > > > > > > On Tue, 16 Jun 2015 23:14:20 +0200 > > > > > > > "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > On Tue, Jun 16, 2015 at 06:33:37PM +0200, Igor Mammedov wrote: > > > > > > > > > since commit > > > > > > > > > 1d4e7e3 kvm: x86: increase user memory slots to 509 > > > > > > > > > > > > > > > > > > it became possible to use a bigger amount of memory > > > > > > > > > slots, which is used by memory hotplug for > > > > > > > > > registering hotplugged memory. > > > > > > > > > However QEMU crashes if it's used with more than ~60 > > > > > > > > > pc-dimm devices and vhost-net since host kernel > > > > > > > > > in module vhost-net refuses to accept more than 65 > > > > > > > > > memory regions. > > > > > > > > > > > > > > > > > > Increase VHOST_MEMORY_MAX_NREGIONS from 65 to 509 > > > > > > > > > > > > > > > > It was 64, not 65. > > > > > > > > > > > > > > > > > to match KVM_USER_MEM_SLOTS fixes issue for vhost-net. > > > > > > > > > > > > > > > > > > Signed-off-by: Igor Mammedov <imammedo@xxxxxxxxxx> > > > > > > > > > > > > > > > > Still thinking about this: can you reorder this to > > > > > > > > be the last patch in the series please? > > > > > > > sure > > > > > > > > > > > > > > > > > > > > > > > Also - 509? > > > > > > > userspace memory slots in terms of KVM, I made it match > > > > > > > KVM's allotment of memory slots for userspace side. > > > > > > > > > > > > Maybe KVM has its reasons for this #. I don't see > > > > > > why we need to match this exactly. > > > > > np, I can cap it at safe 300 slots but it's unlikely that it > > > > > would take cut off 1 extra hop since it's capped by QEMU > > > > > at 256+[initial fragmented memory] > > > > > > > > But what's the point? We allocate 32 bytes per slot. > > > > 300*32 = 9600 which is more than 8K, so we are doing > > > > an order-3 allocation anyway. > > > > If we could cap it at 8K (256 slots) that would make sense > > > > since we could avoid wasting vmalloc space. > > > 256 is amount of hotpluggable slots and there is no way > > > to predict how initial memory would be fragmented > > > (i.e. amount of slots it would take), if we guess wrong > > > we are back to square one with crashing userspace. > > > So I'd stay consistent with KVM's limit 509 since > > > it's only limit, i.e. not actual amount of allocated slots. > > > > > > > I'm still not very happy with the whole approach, > > > > giving userspace ability allocate 4 whole pages > > > > of kernel memory like this. > > > I'm working in parallel so that userspace won't take so > > > many slots but it won't prevent its current versions > > > crashing due to kernel limitation. > > > > Right but at least it's not a regression. If we promise userspace to > > support a ton of regions, we can't take it back later, and I'm concerned > > about the memory usage. > > > > I think it's already safe to merge the binary lookup patches, and maybe > > cache and vmalloc, so that the remaining patch will be small. > it isn't regression with switching to binary search and increasing > slots to 509 either performance wise it's more on improvment side. > And I was thinking about memory usage as well, that's why I've dropped > faster radix tree in favor of more compact array, can't do better > on kernel side of fix. > > Yes we will give userspace to ability to use more slots/and lock up > more memory if it's not able to consolidate memory regions but > that leaves an option for user to run guest with vhost performance > vs crashing it at runtime. Crashing is entirely QEMU's own doing in not handling the error gracefully. > > userspace/targets that could consolidate memory regions should > do so and I'm working on that as well but that doesn't mean > that users shouldn't have a choice. It's a fairly unusual corner case, I'm not yet convinced we need to quickly add support to it when just waiting a bit longer will get us an equivalent (or even more efficient) fix in userspace. > So far it's kernel limitation and this patch fixes crashes > that users see now, with the rest of patches enabling performance > not to regress. When I say regression I refer to an option to limit the array size again after userspace started using the larger size. > > > > > > > > > > > > > I think if we are changing this, it'd be nice to > > > > > > > > create a way for userspace to discover the support > > > > > > > > and the # of regions supported. > > > > > > > That was my first idea before extending KVM's memslots > > > > > > > to teach kernel to tell qemu this number so that QEMU > > > > > > > at least would be able to check if new memory slot could > > > > > > > be added but I was redirected to a more simple solution > > > > > > > of just extending vs everdoing things. > > > > > > > Currently QEMU supports upto ~250 memslots so 509 > > > > > > > is about twice high we need it so it should work for near > > > > > > > future > > > > > > > > > > > > Yes but old kernels are still around. Would be nice if you > > > > > > can detect them. > > > > > > > > > > > > > but eventually we might still teach kernel and QEMU > > > > > > > to make things more robust. > > > > > > > > > > > > A new ioctl would be easy to add, I think it's a good > > > > > > idea generally. > > > > > I can try to do something like this on top of this series. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > drivers/vhost/vhost.c | 2 +- > > > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c > > > > > > > > > index 99931a0..6a18c92 100644 > > > > > > > > > --- a/drivers/vhost/vhost.c > > > > > > > > > +++ b/drivers/vhost/vhost.c > > > > > > > > > @@ -30,7 +30,7 @@ > > > > > > > > > #include "vhost.h" > > > > > > > > > > > > > > > > > > enum { > > > > > > > > > - VHOST_MEMORY_MAX_NREGIONS = 64, > > > > > > > > > + VHOST_MEMORY_MAX_NREGIONS = 509, > > > > > > > > > VHOST_MEMORY_F_LOG = 0x1, > > > > > > > > > }; > > > > > > > > > > > > > > > > > > -- > > > > > > > > > 1.8.3.1 > > -- > > To unsubscribe from this list: send the line "unsubscribe kvm" in > > the body of a message to majordomo@xxxxxxxxxxxxxxx > > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html