On Fri, Apr 17, 2020 at 11:07:38AM +0200, David Hildenbrand wrote: > On 17.04.20 10:50, Michael S. Tsirkin wrote: > > On Fri, Apr 17, 2020 at 09:49:03AM +0200, David Hildenbrand wrote: > >> On 17.04.20 08:28, Michael S. Tsirkin wrote: > >>> On Thu, Apr 16, 2020 at 04:52:42PM -0700, Alexander Duyck wrote: > >>>> On Thu, Apr 16, 2020 at 3:13 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: > >>>>> > >>>>> On Thu, Apr 16, 2020 at 12:30:38PM -0700, Alexander Duyck wrote: > >>>>>> From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx> > >>>>>> > >>>>>> If we have free page hinting or page reporting enabled we should disable it > >>>>>> if the pages are poisoned or initialized on free and we cannot notify the > >>>>>> hypervisor. > >>>>>> > >>>>>> Fixes: 5d757c8d518d ("virtio-balloon: add support for providing free page reports to host") > >>>>>> > >>>>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx> > >>>>> > >>>>> > >>>>> Why not put this logic in the hypervisor? > >>>> > >>>> I did that too. This is to cover the case where somebody is running > >>>> the code prior to my QEMU changes where the page poison feature wasn't > >>>> being enabled. > >>> > >>> > >>> Hmm so that one looks like a QEMU bug (does not expose poison flag). In > >>> the past we just said need to fix the bug where it's found unless the > >>> issue is very widespread and major. Let's assume QEMU learns to always > >>> expose POISON with HINT. Then this configuration (HINT && !POISON) > >>> allows us to detect old QEMU (pre your changes). > >> > >> Don't see why this is a QEMU bug. It's just a feature it does not > >> implement. Perfectly valid. > > > > I'll need to think about this. > > We need to remember that the whole HINT thing is not a mandate for host to > > corrupt memory. It's just some info about page use guest > > gives host. If host corrupts memory it's broken ... > > I don't think that's true, Do you refer to "If host corrupts memory it's broken"? You think that's not true? > and that's not what we currently implement in > the hypervisor for speeding up migration. I still consider > VIRTIO_BALLOON_F_FREE_PAGE_HINT as an alternative technique of > temporarily inflating/deflating. Reporting is like that. But hinting isn't, simply because by the time host gets the hint page may already be in use. Blowing it out unconditionally would lead to easily reproducible guest crashes. > E.g., we don't migrate any of these > pages in the hypervisor, so the content will be zeroed out on the > migration target. Not exactly true. We do a delicate play with the two dirty bits so they *are* migrated sometimes. > If migration fails, the ld content will remain. You > can call that "corrupting memory" - it's exactly what it has been > invented for. Well no, original author repeatedly stated it's "hinting" because page can be in use actually. > > IIRC we decided to glue the semantics of VIRTIO_BALLOON_F_PAGE_POISON to > VIRTIO_BALLOON_F_FREE_PAGE_HINT, which makes in my opinion perfect sense. > > So I propose: > > VIRTIO_BALLOON_F_PAGE_POISON not implemented in the hypervisor: > - Pages inflated/deflated via VIRTIO_BALLOON_F_FREE_PAGE_HINT will > either have the old page content or will be filled with 0. > - This matches what we currently do. > > VIRTIO_BALLOON_F_PAGE_POISON implemented in the hypervisor: > - Pages inflated/deflated via VIRTIO_BALLOON_F_FREE_PAGE_HINT will > either have the old page content or will be filled with poison_val. > - This matches what we should do also during ordinary > inflation/deflation and free page reporting. It's a reasonable option. however ... > Again, nothing is currently broken as we free_page() the pages and don't > care about an eventually changed page content. I don't see how you can say that. ATM on a host without POISON and with HINT, with poison val != 0 and with validation, host can blow a free page away and then guest will detect that as corruption. If guest crashes then either guest or host are broken ;) > It's only relevant once > we ant to change that - and then we can rely on > VIRTIO_BALLOON_F_PAGE_POISON. > >>>> > >>>> The problem is we cannot communicate the full situation to the > >>>> hypervisor without the page poison feature being present. As such I > >>>> would worry about that backfiring on us due to the hypervisor acting > >>>> on incomplete data. > >>> > >>> > >>> I'll try to think about VIRTIO_BALLOON_F_FREE_PAGE_HINT situation > >>> over the weekend. But for the new page reporting, why not > >> > >> I shared my thoughts about VIRTIO_BALLOON_F_FREE_PAGE_HINT in the other > >> thread and think we should simply not care at all for now. > >> > >>> assume host implementation will be sane? > >> > >> I don't think we should enforce having poison support around. See my > >> reply to this mail for an alternative. > > > > OK so you basically say leave this to host to handle? That's more or > > less what I'm saying too. > > Yes, for now. We should at some point change the guest to avoid > re-poisoning/zeroing by not using free_page(). For now, I don't think > there is anything broken, it's just not as efficient as it could get at > this point - tolerable as we don't usually expect our guests to poison > pages or per-initialize them with zero. > > -- > Thanks, > > David / dhildenb _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization