On Mon, Feb 03, 2020 at 12:32:05PM -0800, Tyler Sanderson wrote: > There were apparently good reasons for moving away from OOM notifier callback: > https://lkml.org/lkml/2018/7/12/314 > https://lkml.org/lkml/2018/8/2/322 > > In particular the OOM notifier is worse than the shrinker because: > > 1. It is last-resort, which means the system has already gone through heroics > to prevent OOM. Those heroic reclaim efforts are expensive and impact > application performance. > 2. It lacks understanding of NUMA or other OOM constraints. > 3. It has a higher potential for bugs due to the subtlety of the callback > context. > > Given the above, I think the shrinker API certainly makes the most sense _if_ > the balloon size is static. In that case memory should be reclaimed from the > balloon early and proportionally to balloon size, which the shrinker API > achieves. OK that sounds like VIRTIO_BALLOON_F_FREE_PAGE_HINT then. > However, if the balloon is inflating and intentionally causing memory pressure > then this results in the inefficiency pointed out earlier. And that sounds like VIRTIO_BALLOON_F_DEFLATE_ON_OOM. > If the balloon is inflating but not causing memory pressure then there is no > problem with either API. > > This suggests another route: rather than cause memory pressure to shrink the > page cache, the balloon could issue the equivalent of "echo 3 > /proc/sys/vm/ > drop_caches". > Of course ideally, we want to be more fine grained than "drop everything". We > really want an API that says "drop everything that hasn't been accessed in the > last 5 minutes". > > This would eliminate the need for the balloon to cause memory pressure at all > which avoids the inefficiency in question. Furthermore, this pairs nicely with > the FREE_PAGE_HINT feature. Well we still do have a regression. So we probably should revert for now, and separately look for better solutions. > > On Mon, Feb 3, 2020 at 9:04 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: > > On Mon, Feb 03, 2020 at 05:34:20PM +0100, David Hildenbrand wrote: > > On 03.02.20 17:18, Alexander Duyck wrote: > > > On Mon, 2020-02-03 at 08:11 -0500, Michael S. Tsirkin wrote: > > >> On Thu, Jan 30, 2020 at 11:59:46AM -0800, Tyler Sanderson wrote: > > >>> > > >>> On Thu, Jan 30, 2020 at 7:31 AM Wang, Wei W <wei.w.wang@xxxxxxxxx> > wrote: > > >>> > > >>> On Thursday, January 30, 2020 11:03 PM, David Hildenbrand wrote: > > >>> > On 29.01.20 20:11, Tyler Sanderson wrote: > > >>> > > > > >>> > > > > >>> > > On Wed, Jan 29, 2020 at 2:31 AM David Hildenbrand < > david@xxxxxxxxxx > > >>> > > <mailto:david@xxxxxxxxxx>> wrote: > > >>> > > > > >>> > > On 29.01.20 01:22, Tyler Sanderson via Virtualization > wrote: > > >>> > > > A primary advantage of virtio balloon over other memory > reclaim > > >>> > > > mechanisms is that it can pressure the guest's page > cache into > > >>> > > shrinking. > > >>> > > > > > >>> > > > However, since the balloon driver changed to using the > shrinker > > >>> API > > >>> > > > > > >>> > > > > >>> > <https://github.com/torvalds/linux/commit/ > 71994620bb25a8b109388fefa9 > > >>> > e99a28e355255a#diff-fd202acf694d9eba19c8c64da3e480c9> this > > >>> > > > use case has become a bit more tricky. I'm wondering > what the > > >>> > intended > > >>> > > > device implementation is. > > >>> > > > > > >>> > > > When inflating the balloon against page cache (i.e. no > free > > >>> memory > > >>> > > > remains) vmscan.c will both shrink page cache, but also > invoke > > >>> the > > >>> > > > shrinkers -- including the balloon's shrinker. So the > balloon > > >>> driver > > >>> > > > allocates memory which requires reclaim, vmscan gets > this memory > > >>> > by > > >>> > > > shrinking the balloon, and then the driver adds the > memory back > > >>> to > > >>> > the > > >>> > > > balloon. Basically a busy no-op. > > >>> > > >>> Per my understanding, the balloon allocation won’t invoke > shrinker as > > >>> __GFP_DIRECT_RECLAIM isn't set, no? > > >>> > > >>> I could be wrong about the mechanism, but the device sees lots of > activity on > > >>> the deflate queue. The balloon is being shrunk. And this only starts > once all > > >>> free memory is depleted and we're inflating into page cache. > > >> > > >> So given this looks like a regression, maybe we should revert the > > >> patch in question 71994620bb25 ("virtio_balloon: replace oom notifier > with shrinker") > > >> Besides, with VIRTIO_BALLOON_F_FREE_PAGE_HINT > > >> shrinker also ignores VIRTIO_BALLOON_F_MUST_TELL_HOST which isn't nice > > >> at all. > > >> > > >> So it looks like all this rework introduced more issues than it > > >> addressed ... > > >> > > >> I also CC Alex Duyck for an opinion on this. > > >> Alex, what do you use to put pressure on page cache? > > > > > > I would say reverting probably makes sense. I'm not sure there is much > > > value to having a shrinker running deflation when you are actively > trying > > > to increase the balloon. It would make more sense to wait until you are > > > actually about to start hitting oom. > > > > I think the shrinker makes sense for free page hinting feature > > (everything on free_page_list). > > > > So instead of only reverting, I think we should split it up and always > > register the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT and the OOM > > notifier (as before) for VIRTIO_BALLOON_F_MUST_TELL_HOST. > > OK ... I guess that means we need to fix shrinker to take > VIRTIO_BALLOON_F_MUST_TELL_HOST into account correctly. > Hosts ignore it at the moment but it's a fragile thing > to do what it does and ignore used buffers. > > > (Of course, adapting what is being done in the shrinker and in the OOM > > notifier) > > > > -- > > Thanks, > > > > David / dhildenb > >