On 14.11.18 22:16, David Hildenbrand wrote:
> Right now, pages inflated as part of a balloon driver will be dumped
> by dump tools like makedumpfile. While XEN is able to check in the
> crash kernel whether a certain pfn is actually backed by memory in the
> hypervisor (see xen_oldmem_pfn_is_ram) and optimize this case, dumps of
> virtio-balloon and hv-balloon inflated memory will essentially result in
> zero pages getting allocated by the hypervisor and the dump getting
> filled with this data.
>
> The allocation and reading of zero pages can directly be avoided if a
> dumping tool could know which pages only contain stale information not to
> be dumped.
>
> Also for XEN, calling into the kernel and asking the hypervisor if a
> pfn is backed can be avoided if the dumping tool would skip such pages
> right from the beginning.
>
> Dumping tools have no idea whether a given page is part of a balloon driver
> and shall not be dumped. In particular, PG_reserved cannot be used for that
> purpose as all memory allocated during early boot is also PG_reserved, see
> discussion at [1]. So some other way of indication is required and a new
> page flag is frowned upon.
>
> We have PG_balloon (MAPCOUNT value), which is essentially unused now. I
> suggest renaming it to something more generic (PG_offline) to mark pages as
> logically offline. This flag can then e.g. also be used by virtio-mem in
> the future to mark subsections as offline. Or by other code that wants to
> put pages logically offline (e.g. later maybe poisoned pages that shall
> no longer be used).
>
> This series converts PG_balloon to PG_offline, allows dumping tools to
> query the value to detect such pages and marks pages in the hv-balloon
> and XEN balloon properly as PG_offline. Note that virtio-balloon already
> set pages to PG_balloon (and now PG_offline).
>
> Please note that this is also helpful for a problem we were seeing under
> Hyper-V: Dumping logically offline memory (pages kept fake offline while
> onlining a section via online_page_callback) would under some conditions
> result in a kernel panic when dumping them.
>
> As I have access to neither a XEN nor a Hyper-V installation, this was
> not tested yet (and a makedumpfile change will be required to skip
> dumping these pages).
>
> [1] https://lkml.org/lkml/2018/7/20/566
>
> David Hildenbrand (6):
>   mm: balloon: update comment about isolation/migration/compaction
>   mm: convert PG_balloon to PG_offline
>   kexec: export PG_offline to VMCOREINFO
>   xen/balloon: mark inflated pages PG_offline
>   hv_balloon: mark inflated pages PG_offline
>   PM / Hibernate: exclude all PageOffline() pages
>
>  Documentation/admin-guide/mm/pagemap.rst |  6 +++++
>  drivers/hv/hv_balloon.c                  | 14 ++++++++--
>  drivers/xen/balloon.c                    |  3 +++
>  fs/proc/page.c                           |  4 +--
>  include/linux/balloon_compaction.h       | 34 +++++++++---------------
>  include/linux/page-flags.h               | 11 +++++---
>  include/uapi/linux/kernel-page-flags.h   |  1 +
>  kernel/crash_core.c                      |  2 ++
>  kernel/power/snapshot.c                  |  5 +++-
>  tools/vm/page-types.c                    |  1 +
>  10 files changed, 51 insertions(+), 30 deletions(-)
>

I just did a test with virtio-balloon (and a very simple makedumpfile
patch which I can supply on demand).

1. Guest with 8GB. Inflate balloon to 4GB via
     sudo virsh setmem f29 --size 4096M --live
2. Trigger a kernel panic in the guest
     echo 1 > /proc/sys/kernel/sysrq
     echo c > /proc/sysrq-trigger

Original pages  : 0x00000000001e1da8
  Excluded pages   : 0x00000000001c9221
    Pages filled with zero  : 0x00000000000050b0
    Non-private cache pages : 0x0000000000046547
    Private cache pages     : 0x0000000000002165
    User process data pages : 0x00000000000048cf
    Free pages              : 0x00000000000771f6
    Hwpoison pages          : 0x0000000000000000
    Offline pages           : 0x0000000000100000
  Remaining pages  : 0x0000000000018b87
  (The number of pages is reduced to 5%.)
Memory Hole     : 0x000000000009e258
--------------------------------------------------
Total pages     : 0x0000000000280000

(The number of offline pages matches the 4GB.)

-- 

Thanks,

David / dhildenb
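
The counters in the report above can be cross-checked with a few lines of
Python. This is only a sanity-check sketch, not part of the series or of
makedumpfile: the variable names are made up for illustration, and the
4 KiB page size is an assumption (the mail does not state it).

```python
# Counters copied from the report above; all values are page counts.
PAGE_SIZE = 4096  # assumption: 4 KiB pages, not stated in the mail

original = 0x1E1DA8
excluded = 0x1C9221
excluded_parts = {
    "zero": 0x50B0,
    "non_private_cache": 0x46547,
    "private_cache": 0x2165,
    "user_data": 0x48CF,
    "free": 0x771F6,
    "hwpoison": 0x0,
    "offline": 0x100000,
}
remaining = 0x18B87
memory_hole = 0x9E258
total = 0x280000

# The per-category counters add up to the "Excluded pages" line.
assert sum(excluded_parts.values()) == excluded
# What is not excluded remains in the dump.
assert original - excluded == remaining
# Original pages plus the memory hole give the total.
assert original + memory_hole == total

# Offline pages expressed in GiB, and the share of pages still dumped.
offline_gib = excluded_parts["offline"] * PAGE_SIZE / 2**30
remaining_pct = 100 * remaining / original

print(f"offline: {offline_gib:.1f} GiB")
print(f"remaining: {remaining_pct:.1f}% of original pages")
```

Under the 4 KiB assumption the offline counter works out to exactly the
4GB the balloon was inflated by, and the remaining pages are roughly 5% of
the original ones, as noted in the report.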