On Sat, 2019-08-17 at 23:25 -0400, Qian Cai wrote:
>
> > On Aug 17, 2019, at 12:59 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > On Sat, Aug 17, 2019 at 4:13 AM Qian Cai <cai@xxxxxx> wrote:
> > >
> > > > On Aug 16, 2019, at 11:57 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> > > >
> > > > On Fri, Aug 16, 2019 at 8:34 PM Qian Cai <cai@xxxxxx> wrote:
> > > > >
> > > > > > On Aug 16, 2019, at 5:48 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> > > > > >
> > > > > > On Fri, Aug 16, 2019 at 2:36 PM Qian Cai <cai@xxxxxx> wrote:
> > > > > > >
> > > > > > > Every so often recently, booting Intel CPU server on linux-next
> > > > > > > triggers this warning. Trying to figure out if the commit 7cc7867fb061
> > > > > > > ("mm/devm_memremap_pages: enable sub-section remap") is the
> > > > > > > culprit here.
> > > > > > >
> > > > > > > # ./scripts/faddr2line vmlinux devm_memremap_pages+0x894/0xc70
> > > > > > > devm_memremap_pages+0x894/0xc70:
> > > > > > > devm_memremap_pages at mm/memremap.c:307
> > > > > >
> > > > > > Previously the forced section alignment in devm_memremap_pages() would
> > > > > > cause the implementation to never violate the KASAN_SHADOW_SCALE_SIZE
> > > > > > (12K on x86) constraint.
> > > > > >
> > > > > > Can you provide a dump of /proc/iomem? I'm curious what resource is
> > > > > > triggering such a small alignment granularity.
> > > > >
> > > > > This is with memmap=4G!4G ,
> > > > >
> > > > > # cat /proc/iomem
> > > >
> > > > [..]
> > > > > 100000000-155dfffff : Persistent Memory (legacy)
> > > > >   100000000-155dfffff : namespace0.0
> > > > > 155e00000-15982bfff : System RAM
> > > > >   155e00000-156a00fa0 : Kernel code
> > > > >   156a00fa1-15765d67f : Kernel data
> > > > >   157837000-1597fffff : Kernel bss
> > > > > 15982c000-1ffffffff : Persistent Memory (legacy)
> > > > > 200000000-87fffffff : System RAM
> > > >
> > > > Ok, looks like 4G is bad choice to land the pmem emulation on this
> > > > system because it collides with where the kernel is deployed and gets
> > > > broken into tiny pieces that violate kasan's. This is a known problem
> > > > with memmap=. You need to pick an memory range that does not collide
> > > > with anything else. See:
> > > >
> > > > https://nvdimm.wiki.kernel.org/how_to_choose_the_correct_memmap_kernel_parameter_for_pmem_on_your_system
> > > >
> > > > ...for more info.
> > >
> > > Well, it seems I did exactly follow the information in that link,
> > >
> > > [    0.000000] BIOS-provided physical RAM map:
> > > [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000093fff] usable
> > > [    0.000000] BIOS-e820: [mem 0x0000000000094000-0x000000000009ffff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000005a7a0fff] usable
> > > [    0.000000] BIOS-e820: [mem 0x000000005a7a1000-0x000000005b5e0fff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x000000005b5e1000-0x00000000790fefff] usable
> > > [    0.000000] BIOS-e820: [mem 0x00000000790ff000-0x00000000791fefff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x00000000791ff000-0x000000007b5fefff] ACPI NVS
> > > [    0.000000] BIOS-e820: [mem 0x000000007b5ff000-0x000000007b7fefff] ACPI data
> > > [    0.000000] BIOS-e820: [mem 0x000000007b7ff000-0x000000007b7fffff] usable
> > > [    0.000000] BIOS-e820: [mem 0x000000007b800000-0x000000008fffffff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x00000000ff800000-0x00000000ffffffff] reserved
> > > [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000087fffffff] usable
> > >
> > > Where 4G is good. Then,
> > >
> > > [    0.000000] user-defined physical RAM map:
> > > [    0.000000] user: [mem 0x0000000000000000-0x0000000000093fff] usable
> > > [    0.000000] user: [mem 0x0000000000094000-0x000000000009ffff] reserved
> > > [    0.000000] user: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> > > [    0.000000] user: [mem 0x0000000000100000-0x000000005a7a0fff] usable
> > > [    0.000000] user: [mem 0x000000005a7a1000-0x000000005b5e0fff] reserved
> > > [    0.000000] user: [mem 0x000000005b5e1000-0x00000000790fefff] usable
> > > [    0.000000] user: [mem 0x00000000790ff000-0x00000000791fefff] reserved
> > > [    0.000000] user: [mem 0x00000000791ff000-0x000000007b5fefff] ACPI NVS
> > > [    0.000000] user: [mem 0x000000007b5ff000-0x000000007b7fefff] ACPI data
> > > [    0.000000] user: [mem 0x000000007b7ff000-0x000000007b7fffff] usable
> > > [    0.000000] user: [mem 0x000000007b800000-0x000000008fffffff] reserved
> > > [    0.000000] user: [mem 0x00000000ff800000-0x00000000ffffffff] reserved
> > > [    0.000000] user: [mem 0x0000000100000000-0x00000001ffffffff] persistent (type 12)
> > > [    0.000000] user: [mem 0x0000000200000000-0x000000087fffffff] usable
> > >
> > > The doc did mention that “There seems to be an issue with CONFIG_KSAN at
> > > the moment however.” without more detail though.
> >
> > Does disabling CONFIG_RANDOMIZE_BASE help? Maybe that workaround has
> > regressed. Effectively we need to find what is causing the kernel to
> > sometimes be placed in the middle of a custom reserved memmap= range.
>
> Yes, disabling KASLR works good so far. Assuming the workaround, i.e.,
> f28442497b5c (“x86/boot: Fix KASLR and memmap= collision”) is correct.
>
> The only other commit that might regress it from my research so far is,
>
> d52e7d5a952c (“x86/KASLR: Parse all 'memmap=' boot option entries”)
>

It turns out that the original commit f28442497b5c (“x86/boot: Fix KASLR and
memmap= collision”) has a bug: it cannot handle "memmap=" when it is given in
CONFIG_CMDLINE rather than as a parameter from the bootloader. When that commit
(as well as commit d52e7d5a952c) calls get_cmd_line_ptr() in order to run
mem_avoid_memmap(), "boot_params" has no knowledge of CONFIG_CMDLINE. The
kernel only processes the CONFIG_CMDLINE parameters later, in setup_arch().
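
To illustrate the ordering problem, here is a minimal user-space sketch. It is
not the kernel code: loader_cmdline, builtin_cmdline, has_memmap(),
early_sees_memmap() and late_sees_memmap() are hypothetical names, and the
merge order of the two command lines is an assumption made only for the
example. It just shows that a scan limited to the bootloader-provided string
cannot find a "memmap=" that exists only in the built-in command line.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-ins, not kernel symbols:
 * loader_cmdline  - what the bootloader passed (all that the early
 *                   KASLR code can reach through boot_params)
 * builtin_cmdline - what CONFIG_CMDLINE would contain
 */
static const char *loader_cmdline = "root=/dev/sda1 ro";
static const char *builtin_cmdline = "memmap=4G!4G";

/* Return 1 if a "memmap=" option starts at a word boundary in cmdline. */
static int has_memmap(const char *cmdline)
{
	const char *p = cmdline;

	while ((p = strstr(p, "memmap=")) != NULL) {
		if (p == cmdline || p[-1] == ' ')
			return 1;
		p++;	/* matched inside another word, keep looking */
	}
	return 0;
}

/* Early view: only the bootloader command line is visible. */
static int early_sees_memmap(void)
{
	return has_memmap(loader_cmdline);
}

/* Late view: the built-in command line has been merged in (assumed order). */
static int late_sees_memmap(void)
{
	char merged[256];

	snprintf(merged, sizeof(merged), "%s %s",
		 builtin_cmdline, loader_cmdline);
	return has_memmap(merged);
}
```

With these example strings, early_sees_memmap() returns 0 while
late_sees_memmap() returns 1, which is the same gap that leaves KASLR free to
place the kernel inside the memmap= range.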