Wei Wang wrote: > On 10/09/2017 11:20 PM, Michael S. Tsirkin wrote: > > On Sat, Sep 30, 2017 at 12:05:52PM +0800, Wei Wang wrote: > >> +static inline void xb_set_page(struct virtio_balloon *vb, > >> + struct page *page, > >> + unsigned long *pfn_min, > >> + unsigned long *pfn_max) > >> +{ > >> + unsigned long pfn = page_to_pfn(page); > >> + > >> + *pfn_min = min(pfn, *pfn_min); > >> + *pfn_max = max(pfn, *pfn_max); > >> + xb_preload(GFP_KERNEL); > >> + xb_set_bit(&vb->page_xb, pfn); > >> + xb_preload_end(); > >> +} > >> + > > So, this will allocate memory > > > > ... > > > >> @@ -198,9 +327,12 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num) > >> struct page *page; > >> struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info; > >> LIST_HEAD(pages); > >> + bool use_sg = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_SG); > >> + unsigned long pfn_max = 0, pfn_min = ULONG_MAX; > >> > >> - /* We can only do one array worth at a time. */ > >> - num = min(num, ARRAY_SIZE(vb->pfns)); > >> + /* Traditionally, we can only do one array worth at a time. */ > >> + if (!use_sg) > >> + num = min(num, ARRAY_SIZE(vb->pfns)); > >> > >> mutex_lock(&vb->balloon_lock); > >> /* We can't release more pages than taken */ > > And is sometimes called on OOM. > > > > > > I suspect we need to > > > > 1. keep around some memory for leak on oom > > > > 2. for non oom allocate outside locks > > > > > > I think maybe we can optimize the existing balloon logic, which could > remove the big balloon lock: > > It would not be necessary to have the inflating and deflating run at the > same time. > For example, 1st request to inflate 7G RAM, when 1GB has been given to > the host (so 6G left), the > 2nd request to deflate 5G is received. Instead of waiting for the 1st > request to inflate 6G and then > continuing with the 2nd request to deflate 5G, we can do a diff (6G to > inflate - 5G to deflate) immediately, > and got 1G to inflate. In this way, all that driver will do is to simply > inflate another 1G. > > Same for the OOM case: when OOM asks for 1G, while inflating 5G is in > progress, then the driver can > deduct 1G from the amount that needs to inflate, and as a result, it > will inflate 4G. > > In this case, we will never have the inflating and deflating task run at > the same time, so I think it is > possible to remove the lock, and therefore, we will not have that > deadlock issue. > > What would you guys think? What is balloon_lock at virtballoon_migratepage() for? e22504296d4f64fb "virtio_balloon: introduce migration primitives to balloon pages" f68b992bbb474641 "virtio_balloon: fix race by fill and leak" And even if we could remove balloon_lock, you still cannot use __GFP_DIRECT_RECLAIM at xb_set_page(). I think you will need to use "whether it is safe to wait" flag from "[PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify()" .