Hi David,

On Mon, Dec 09, 2019 at 06:48:34PM +0100, David Hildenbrand wrote:
> If max_pfn is not aligned to a section boundary, we can easily run into
> BUGs. This can e.g., be triggered on x86-64 under QEMU by specifying a
> memory size that is not a multiple of 128MB (e.g., 4097MB, but also
> 4160MB). I was told that on real HW, we can easily have this scenario
> (esp., one of the main reasons sub-section hotadd of devmem was added).
>
> The issue is, that we have a valid memmap (pfn_valid()) for the
> whole section, and the whole section will be marked "online".
> pfn_to_online_page() will succeed, but the memmap contains garbage.
>
> E.g., doing a "cat /proc/kpageflags > /dev/null" results in
>
> [ 303.218313] BUG: unable to handle page fault for address: fffffffffffffffe
> [ 303.218899] #PF: supervisor read access in kernel mode
> [ 303.219344] #PF: error_code(0x0000) - not-present page
> [ 303.219787] PGD 12614067 P4D 12614067 PUD 12616067 PMD 0
> [ 303.220266] Oops: 0000 [#1] SMP NOPTI
> [ 303.220587] CPU: 0 PID: 424 Comm: cat Not tainted 5.4.0-next-20191128+ #17

I can't reproduce this on x86-64 qemu, next-20191128 or mainline, with
either memory size. What config are you using? How often are you hitting
it?

It may not have anything to do with the config, and I may be getting
lucky with the garbage in my memory.