Hi Manjunath,

On 12/04/2018 10:49 AM, Manjunath Patil wrote:
> On 12/3/2018 6:16 PM, Boris Ostrovsky wrote:
>
>> On 12/3/18 8:14 PM, Dongli Zhang wrote:
>>> Hi Boris,
>>>
>>> On 12/04/2018 12:07 AM, Boris Ostrovsky wrote:
>>>> On 12/2/18 3:31 PM, Manjunath Patil wrote:
>>>>> On 11/30/2018 2:33 PM, Boris Ostrovsky wrote:
>>>>>
>>>>>> On 11/30/18 4:49 PM, Manjunath Patil wrote:
>>>>>>> Thank you, Boris, for your comments. I removed a faulty email address of mine.
>>>>>>>
>>>>>>> Replies inline.
>>>>>>> On 11/30/2018 12:42 PM, Boris Ostrovsky wrote:
>>>>>>>> On 11/29/18 12:17 AM, Manjunath Patil wrote:
>>>>>>>>> Hi,
>>>>>>>>> Feel free to suggest/comment on this.
>>>>>>>>>
>>>>>>>>> I am trying to do the following at the dst during migration now:
>>>>>>>>> 1. Don't clear the old rinfo in blkif_free(); instead just clean it.
>>>>>>>>> 2. Store the old rinfo and nr_rings into temp variables in
>>>>>>>>> negotiate_mq().
>>>>>>>>> 3. Let nr_rings get re-calculated based on backend data.
>>>>>>>>> 4. Try allocating new memory based on the new nr_rings.
>>>>>>>> Since I suspect the number of rings will likely be the same, why not
>>>>>>>> reuse the rings in the common case?
>>>>>>> I thought attaching devices would be more frequent than migration, hence
>>>>>>> I did not want to add an extra check for
>>>>>>> - whether I am inside the migration code path, and
>>>>>>> - whether the new nr_rings is equal to the old nr_rings.
>>>>>>>
>>>>>>> Sure, adding such a check would avoid the memory allocation
>>>>>>> altogether in the migration path,
>>>>>>> but it would add a little overhead to normal device addition.
>>>>>>>
>>>>>>> Do you think it's worth adding that change?
>>>>>> IMO a couple of extra checks are not going to make much difference.
>>>>> I will add this change.
>>>>>> I wonder though --- have you actually seen the case where you did fail
>>>>>> allocation and the changes provided in this patch made things work? I am
>>>>>> asking because right after negotiate_mq() we will call setup_blkring()
>>>>>> and it will want to allocate a bunch of memory. A failure there is fatal
>>>>>> (to ring setup). So it seems to me that you will survive negotiate_mq()
>>>>>> but then will likely fail soon after.
>>>>> I have noticed the ENOMEM inside negotiate_mq() on a ct machine. When I
>>>>> included my patch, I manually triggered the ENOMEM using a debug flag.
>>>>> The patch works for ENOMEM inside negotiate_mq().
>>>>>
>>>>> As you mentioned, if we really hit the ENOMEM in negotiate_mq(), we
>>>>> might hit it in setup_blkring() as well.
>>>>> We should add a similar change for the blkif_sring struct as well.
>>>> Won't you have a similar issue with other frontends, say, netfront?
>>> I think the kmalloc fails, but not because of OOM.
>>>
>>> In fact, the size of "blkfront_ring_info" is large. When the domU has 4
>>> queues/rings, the size of 4 blkfront_ring_info structures can be about 300+ KB.
>>>
>>> There is a chance that a kmalloc() of 300+ KB would fail.
>>>
>>> About netfront, kmalloc() of 8 'struct netfront_queue' seems to consume < 70 KB?
>> TBH these look like comparable sizes to me. I am not convinced that
>> these changes will make a difference. If the number of rings on source
>> and destination were the same, I'd absolutely agree with this patch, but
>> since you are trying to handle different sizes the code becomes somewhat
>> more complex, and I am not sure it's worth it. (Can you actually give me
>> an example of when we can expect the number of rings to change during
>> migration?)
>>
>> But others may think differently.
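For reference, the "reuse when the ring count is unchanged" idea being discussed
could look roughly like the sketch below. The struct fields (nr_rings, rinfo,
xbdev) and the helpers follow xen-blkfront, but the function itself is only an
untested illustration of the control flow, not the actual patch; it assumes
blkif_free() has cleaned, but not freed, the old rinfo as proposed earlier in
this thread.

static int negotiate_mq_sketch(struct blkfront_info *info)
{
        unsigned int backend_max_queues;
        unsigned int old_nr_rings = info->nr_rings;
        struct blkfront_ring_info *old_rinfo = info->rinfo;

        /* Check how many queues the backend supports. */
        backend_max_queues = xenbus_read_unsigned(info->xbdev->otherend,
                                                  "multi-queue-max-queues", 1);
        info->nr_rings = min(backend_max_queues, xen_blkif_max_queues);
        /* We need at least one ring. */
        if (!info->nr_rings)
                info->nr_rings = 1;

        /*
         * Common case on migration: the ring count did not change, so keep
         * the memory that was allocated on the source side.
         */
        if (old_rinfo && old_nr_rings == info->nr_rings) {
                info->rinfo = old_rinfo;
                return 0;
        }

        /* Ring count changed (or first attach): allocate a fresh array. */
        kfree(old_rinfo);
        info->rinfo = kcalloc(info->nr_rings,
                              sizeof(struct blkfront_ring_info),
                              GFP_KERNEL);
        if (!info->rinfo) {
                xenbus_dev_fatal(info->xbdev, -ENOMEM,
                                 "allocating ring_info structure");
                info->nr_rings = 0;
                return -ENOMEM;
        }

        /* Per-ring init (lists, work item, ring lock) elided here. */
        return 0;
}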
> Hi Boris,
> I think allocation of a 300 KB chunk [an order-7 allocation] is more likely to
> fail than a 70 KB one [order 5], especially under memory pressure.
> If it comes to that, I think we should fix this too.
>
> The number of rings in most cases remains 4, thanks to the xen_blkif_max_queues
> module parameter.
> If the src host has allocated fewer than 4 [maybe the vcpus given to this dom0
> were fewer than 4], then we can expect the dst to allocate more than the src
> side, and vice versa.

xen_blkif_max_queues is tunable, so the size passed to kmalloc() would be even
larger when both xen_blkif_max_queues and the dom0 vcpu count are large.

Dongli Zhang
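P.S. A quick back-of-the-envelope check of the allocation orders mentioned
above, using the rough sizes quoted in this thread (plain userspace C, not
kernel code; order_for() just mirrors the kernel's get_order() semantics for
4 KB pages):

#include <stdio.h>

#define PAGE_SIZE 4096UL

/* Smallest order such that (PAGE_SIZE << order) >= size. */
static unsigned int order_for(unsigned long size)
{
        unsigned int order = 0;

        while ((PAGE_SIZE << order) < size)
                order++;
        return order;
}

int main(void)
{
        unsigned long blkfront = 300UL * 1024; /* ~300 KB for 4 blkfront_ring_info */
        unsigned long netfront = 70UL * 1024;  /* <70 KB for 8 netfront_queue */

        printf("blkfront: %lu KB -> order %u\n", blkfront / 1024, order_for(blkfront));
        printf("netfront: %lu KB -> order %u\n", netfront / 1024, order_for(netfront));
        return 0;
}

This prints order 7 for ~300 KB and order 5 for ~70 KB, matching the numbers
above: the blkfront allocation needs 128 contiguous pages while the netfront
one needs 32, which is why the former is more likely to fail under memory
pressure.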