PageOffline: refcount, flags and memdesc


 



Hi,

I'm currently staring again at PageOffline and wonder how we could prepare it for the memdesc future, and if we can remove refcount handling.


Currently, we set PageOffline in the following cases (one nasty exception below):

(a) A memory block gets onlined, whereby we initialize all "struct page"s
    to PageOffline + refcount of 1: memmap_init_range(). These pages are
    expected to get onlined via generic_online_page() later. Drivers
    might decide to leave some offline, because they are not backed by
    actual memory in the hypervisor. Some drivers still use free_page()
    instead of generic_online_page().

(b) We allocate pages (alloc_page(), alloc_contig_pages() ...) to
    logically offline them, whereby the refcount is set to 1 by the
    buddy and PageOffline is set manually by the driver afterwards.

We clear PageOffline in the following cases (one nasty exception below):

(a) We want to return a page to the buddy (free_page()/
    free_contig_range()).
    PageOffline is cleared by the driver and freeing the page will
    decrement the refcount to 0.
(b) We want to expose it to the buddy the first time
    (generic_online_page). We will force the refcount to 0.
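
To make the current state transitions concrete, here is a rough user-space toy model of the set/clear cases above. It is only a sketch: "struct toy_page", its "in_buddy" field and all toy_*() helpers are made up for illustration and are not kernel code or kernel API.

```c
/* User-space toy model of today's PageOffline + refcount handling. */
#include <assert.h>
#include <stdbool.h>

struct toy_page {
	bool offline;	/* models the PageOffline page type */
	int refcount;	/* models the page refcount */
	bool in_buddy;	/* hypothetical: page is free in the buddy */
};

/* Set (a): memmap initialization when a memory block gets onlined. */
static void toy_memmap_init(struct toy_page *p)
{
	p->offline = true;
	p->refcount = 1;
	p->in_buddy = false;
}

/* Set (b): allocate from the buddy, then logically offline the page. */
static void toy_alloc_and_offline(struct toy_page *p)
{
	assert(p->in_buddy);
	p->in_buddy = false;
	p->refcount = 1;	/* set by the buddy on allocation */
	p->offline = true;	/* set manually by the driver */
}

/* Clear (a): driver clears PageOffline, freeing drops the refcount 1 -> 0. */
static void toy_free_to_buddy(struct toy_page *p)
{
	assert(p->offline && p->refcount == 1);
	p->offline = false;
	p->refcount--;
	p->in_buddy = true;
}

/* Clear (b): first exposure to the buddy, refcount is forced to 0. */
static void toy_online_first_time(struct toy_page *p)
{
	assert(p->offline);
	p->offline = false;
	p->refcount = 0;
	p->in_buddy = true;
}
```

Note how both "clear" paths end in the same state (not offline, refcount 0, in the buddy), but get there differently: one decrements, the other overwrites.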

There are still subtle differences between exposing a page to the buddy the first time and freeing a page that came from the buddy, such as debug_pagealloc_map_pages() in __free_pages_core(). I'm hoping we can get rid of them long-term, or just abstract them internally.


I'd like to stop using the refcount for PageOffline pages, and keep the refcount always at 0.

But the refcount is currently used to detect whether we are allowed to offline memory blocks that contain PageOffline pages, because only selected drivers support re-onlining. Well, and it is used when returning the pages to the buddy, where free_page()/free_contig_range() ... expect a refcount of 1.

Further, virtio-mem currently uses the PageDirty() bit to remember if a PageOffline page was already exposed to the buddy before, or if we must use generic_online_page().

For now we would need the following information, which could be stored in 2 flags, leaving the refcount at 0:

(1) Was the page obtained from the buddy, or was it never exposed to the buddy?

PageOffline() && PageOfflineNeverOnlined()

(2) The driver supports actual memory offlining+reonlining, so these
    pages can be skipped when offlining.

PageOffline() && PageOfflineSkippable()
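
As a rough illustration of how the two flags could replace the refcount-based checks, here is a small user-space sketch. Everything in it is an assumption for illustration: the "fl_" names, the flag bit layout and the simplified offlining check are made up, and nothing here is existing kernel API.

```c
/* User-space sketch: two extra flags instead of a refcount of 1. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum fl_flags {
	FL_OFFLINE		 = 1u << 0, /* models PageOffline() */
	FL_OFFLINE_NEVER_ONLINED = 1u << 1, /* models PageOfflineNeverOnlined() */
	FL_OFFLINE_SKIPPABLE	 = 1u << 2, /* models PageOfflineSkippable() */
};

struct fl_page {
	unsigned int flags;
	int refcount;	/* stays 0 for offline pages in this model */
};

/* (1) Offline page that was never exposed to the buddy. */
static bool fl_never_onlined(const struct fl_page *p)
{
	return (p->flags & FL_OFFLINE) && (p->flags & FL_OFFLINE_NEVER_ONLINED);
}

/* (2) Offline page whose driver supports offlining + re-onlining. */
static bool fl_skippable(const struct fl_page *p)
{
	return (p->flags & FL_OFFLINE) && (p->flags & FL_OFFLINE_SKIPPABLE);
}

/*
 * Simplified offlining check: every offline page in the block must be
 * skippable. Pages that are not offline are assumed to be handled
 * elsewhere (isolation/migration) and are ignored by this model.
 */
static bool fl_block_can_offline(const struct fl_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!(pages[i].flags & FL_OFFLINE))
			continue;
		assert(pages[i].refcount == 0);
		if (!fl_skippable(&pages[i]))
			return false;
	}
	return true;
}
```

The point of the sketch is only that neither question needs the refcount anymore; whether the flags would live in page flags, the page type or somewhere in a memdesc is left open.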


But when allocating/freeing pages we would still mess with the refcount, which is bad.

We could have a dedicated interface for freeing them, where we abstract the generic_online_page() bits, and leave the refcount at 0:

free_offline_page()
free_offline_page_range()

And

alloc_offline_page()
alloc_offline_page_range()
alloc_offline_pages()

I'm not super happy about the "alloc/free" terminology, but nothing better came to mind.
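
To give an idea of the semantics I have in mind, here is a minimal user-space sketch of such an interface. The signatures, the fixed-size pool and the "op_" names are all hypothetical; the real thing would sit on top of the buddy and look different. What matters is that the refcount is never touched and that the "first exposure" special case is hidden inside the free path.

```c
/* User-space sketch of a refcount-free alloc/free interface. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OP_POOL_PAGES 4

struct op_page {
	bool offline;		/* models PageOffline() */
	bool never_onlined;	/* models PageOfflineNeverOnlined() */
	bool in_buddy;		/* hypothetical: page is free in the buddy */
	int refcount;		/* must stay 0 throughout */
};

static struct op_page op_pool[OP_POOL_PAGES];

/* All pages start out offline and never exposed to the buddy. */
static void op_pool_init(void)
{
	for (size_t i = 0; i < OP_POOL_PAGES; i++) {
		op_pool[i].offline = true;
		op_pool[i].never_onlined = true;
		op_pool[i].in_buddy = false;
		op_pool[i].refcount = 0;
	}
}

/* Take a free page from the "buddy" and hand it out as an offline page. */
static struct op_page *op_alloc_offline_page(void)
{
	for (size_t i = 0; i < OP_POOL_PAGES; i++) {
		struct op_page *p = &op_pool[i];

		if (!p->in_buddy)
			continue;
		p->in_buddy = false;
		p->offline = true;
		p->never_onlined = false;	/* it came from the buddy */
		return p;			/* refcount untouched */
	}
	return NULL;
}

/* Return an offline page to the "buddy", first exposure or not. */
static void op_free_offline_page(struct op_page *p)
{
	assert(p->offline && p->refcount == 0);
	if (p->never_onlined) {
		/* First exposure: the generic_online_page() bits would go here. */
		p->never_onlined = false;
	}
	p->offline = false;
	p->in_buddy = true;
}
```

With something like this, a driver would no longer have to remember (e.g., via PageDirty()) which of the two freeing paths applies to a given page.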


There is one complication to sort out: balloon_compaction.h supports moving PageOffline pages, and seems to use the page lock, page refcount, page lru, page private... which is all rather nasty. I wonder if these should get their own page type, like PageMovableOffline, and we'd mostly leave them alone for now. This would mean that virtio-balloon, vmware-balloon and ppc CMM would keep doing the old refcount-based thing but with a new page type.


I assume this all goes into the direction of getting pages from the buddy and returning them without refcounts ... thoughts?

--
Cheers,

David / dhildenb




