On 28.01.19 21:19, Michal Hocko wrote:
> On Mon 28-01-19 21:02:52, David Hildenbrand wrote:
>> On 28.01.19 17:04, David Hildenbrand wrote:
>>> While debugging some crashes related to virtio-balloon deflation that
>>> happened under the old balloon migration code, I stumbled over a race
>>> that still exists today.
>>>
>>> What we experienced:
>>>
>>> drivers/virtio/virtio_balloon.c:release_pages_balloon():
>>> - WARNING: CPU: 13 PID: 6586 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
>>> - list_del corruption. prev->next should be ffffe253961090a0, but was dead000000000100
>>>
>>> Turns out after having added the page to a local list when dequeuing,
>>> the page would suddenly be moved to an LRU list before we would free it
>>> via the local list, corrupting both lists. So a page we own and that is
>>> !LRU was moved to an LRU list.
>>>
>>> In __unmap_and_move(), we lock the old and newpage and perform the
>>> migration. In case of virtio-balloon, the new page will become
>>> movable, the old page will no longer be movable.
>>>
>>> However, after unlocking newpage, there is nothing stopping the newpage
>>> from getting dequeued and freed by virtio-balloon. This
>>> will result in the newpage
>>> 1. No longer having PageMovable()
>>> 2. Getting moved to the local list before finally freeing it (using
>>>    page->lru)
>>>
>>> Back in the migration thread in __unmap_and_move(), we would after
>>> unlocking the newpage suddenly no longer have PageMovable(newpage) and
>>> will therefore call putback_lru_page(newpage), modifying page->lru
>>> although that list is still in use by virtio-balloon.
>>>
>>> To summarize, we have a race between migrating the newpage and checking
>>> for PageMovable(newpage). Instead of checking PageMovable(newpage), we
>>> can simply rely on is_lru of the original page.
>>>
>>> Looks like this was introduced by d6d86c0a7f8d ("mm/balloon_compaction:
>>> redesign ballooned pages management"), which was backported up to 3.12.
>>> Old compaction code used PageBalloon() via __is_movable_balloon_page()
>>> instead of PageMovable(), however with the same semantics.
>>>
>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>>> Cc: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
>>> Cc: Michal Hocko <mhocko@xxxxxxxx>
>>> Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
>>> Cc: Jan Kara <jack@xxxxxxx>
>>> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
>>> Cc: Dominik Brodowski <linux@xxxxxxxxxxxxxxxxxxxx>
>>> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
>>> Cc: Vratislav Bendel <vbendel@xxxxxxxxxx>
>>> Cc: Rafael Aquini <aquini@xxxxxxxxxx>
>>> Cc: Konstantin Khlebnikov <k.khlebnikov@xxxxxxxxxxx>
>>> Cc: Minchan Kim <minchan@xxxxxxxxxx>
>>> Cc: stable@xxxxxxxxxxxxxxx # 3.12+
>>> Fixes: d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management")
>>> Reported-by: Vratislav Bendel <vbendel@xxxxxxxxxx>
>>> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
>>> Acked-by: Rafael Aquini <aquini@xxxxxxxxxx>
>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>>> ---
>>>  mm/migrate.c | 6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 4512afab46ac..31e002270b05 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1135,10 +1135,12 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>>  	 * If migration is successful, decrease refcount of the newpage
>>>  	 * which will not free the page because new page owner increased
>>>  	 * refcounter. As well, if it is LRU page, add the page to LRU
>>> -	 * list in here.
>>> +	 * list in here. Don't rely on PageMovable(newpage), as that could
>>> +	 * already have changed after unlocking newpage (e.g.
>>> +	 * virtio-balloon deflation).
>>>  	 */
>>>  	if (rc == MIGRATEPAGE_SUCCESS) {
>>> -		if (unlikely(__PageMovable(newpage)))
>>> +		if (unlikely(!is_lru))
>>>  			put_page(newpage);
>>>  		else
>>>  			putback_lru_page(newpage);
>>>
>>
>> Vratislav just pointed out that this issue should not happen on upstream
>> as __PageMovable(newpage) will still return true even after
>> __ClearPageMovable(newpage). Only PageMovable(newpage) would actually
>> return false.
>>
>> (not sure if I am happy about this, this is horribly confusing and
>> complicated)
>
> It is confusing as hell! __ClearPageMovable is a misnomer and I have to
> admit I have misread the implementation to actually ~PAGE_MAPPING_MOVABLE.
>
>> I am not 100% sure yet, but I guess Vratislav is right. So it was
>> effectively fixed by
>>
>> b1123ea6d3b3 ("mm: balloon: use general non-lru movable page feature"),
>> which checks for __PageMovable(newpage) instead of
>> __is_movable_balloon_page(newpage).
>
> So this is not just a clean up. Sigh!
>
>> Anybody wanting to fix stable kernels either has to backport something
>> proposed in this patch or b1123ea6d3b3.
>
> I think we should go with a simple patch for stable so this patch sounds
> like a good thing. *PageMovable thingy needs a much better documentation
> and ideally a cleaner implementation as well. The current state is just
> incomprehensible.

Especially as __ClearPageMovable will not make __PageMovable fail, fun
with code :)

>
> David, could you reformulate the changelog accordingly please? My ack
> still holds.

You mean reformulating + resending for stable kernels only?

b1123ea6d3b3 was merged with v4.8. d6d86c0a7f8d was backported to
v3.12+. So v3.12 - v4.7 are affected and will put these pages back on
the LRU list.

Without 195a8c43e93d ("virtio-balloon: deflate via a page list") -
merged with v4.13 - this BUG is not immediately visible, I guess. Pages
are still added to the wrong list (LRU although they shouldn't be), but
the virtio-balloon local list does not exist there, so it cannot get
corrupted.
I assume adding these pages to the LRU list is bad in itself, right?

If a distro backported 195a8c43e93d without b1123ea6d3b3, the BUG becomes
directly visible (this is how we hit it).

Thanks!

-- 

Thanks,

David / dhildenb
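
For readers following the __PageMovable() vs. PageMovable() discussion
above, here is a small userspace model of the two checks. It is only a
sketch, not kernel code: struct model_mapping, struct model_page and all
model_*() helpers are hypothetical stand-ins, and the flag values and the
"keep only the movable bit" behaviour reflect my reading of the v4.8+
sources, so treat them as assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits kept in the low bits of page->mapping (assumed values). */
#define PAGE_MAPPING_ANON	0x1UL
#define PAGE_MAPPING_MOVABLE	0x2UL
#define PAGE_MAPPING_FLAGS	(PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE)

/* Hypothetical stand-in for an address_space with an isolate_page callback. */
struct model_mapping {
	int (*isolate_page)(void);	/* pointer member keeps alignment >= 4 */
};

/* Hypothetical stand-in for struct page: pointer and flags share one word. */
struct model_page {
	uintptr_t mapping;
};

/* Dummy callback; only its presence matters for the model. */
static int model_isolate(void)
{
	return 0;
}

/* Register a mapping and mark the page movable. */
static void model_set_movable(struct model_page *p, struct model_mapping *m)
{
	p->mapping = (uintptr_t)m | PAGE_MAPPING_MOVABLE;
}

/* Models __PageMovable(): looks at the flag bits only. */
static int model_flag_movable(const struct model_page *p)
{
	return (p->mapping & PAGE_MAPPING_FLAGS) == PAGE_MAPPING_MOVABLE;
}

/* Models PageMovable(): flag bit plus a mapping that can isolate pages. */
static int model_page_movable(const struct model_page *p)
{
	const struct model_mapping *m;

	if (!model_flag_movable(p))
		return 0;
	m = (const struct model_mapping *)(p->mapping & ~PAGE_MAPPING_FLAGS);
	return m != NULL && m->isolate_page != NULL;
}

/* Models __ClearPageMovable(): drops the pointer but keeps the flag bit. */
static void model_clear_movable(struct model_page *p)
{
	assert(model_page_movable(p));	/* stands in for the kernel's sanity check */
	p->mapping &= PAGE_MAPPING_MOVABLE;
}
```

In this model, after model_set_movable() followed by
model_clear_movable(), model_flag_movable() still returns nonzero while
model_page_movable() returns 0. That is the behaviour Vratislav pointed
out, and it is why a check based on the flag bit alone does not flip
after the balloon driver has released the page.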