On Sat, Aug 17, 2019 at 4:13 AM Qian Cai <cai@xxxxxx> wrote:
>
>
>
> > On Aug 16, 2019, at 11:57 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > On Fri, Aug 16, 2019 at 8:34 PM Qian Cai <cai@xxxxxx> wrote:
> >>
> >>
> >>
> >>> On Aug 16, 2019, at 5:48 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >>>
> >>> On Fri, Aug 16, 2019 at 2:36 PM Qian Cai <cai@xxxxxx> wrote:
> >>>>
> >>>> Every so often recently, booting Intel CPU server on linux-next triggers this
> >>>> warning. Trying to figure out if the commit 7cc7867fb061
> >>>> ("mm/devm_memremap_pages: enable sub-section remap") is the culprit here.
> >>>>
> >>>> # ./scripts/faddr2line vmlinux devm_memremap_pages+0x894/0xc70
> >>>> devm_memremap_pages+0x894/0xc70:
> >>>> devm_memremap_pages at mm/memremap.c:307
> >>>
> >>> Previously the forced section alignment in devm_memremap_pages() would
> >>> cause the implementation to never violate the KASAN_SHADOW_SCALE_SIZE
> >>> (12K on x86) constraint.
> >>>
> >>> Can you provide a dump of /proc/iomem? I'm curious what resource is
> >>> triggering such a small alignment granularity.
> >>
> >> This is with memmap=4G!4G ,
> >>
> >> # cat /proc/iomem
> > [..]
> >> 100000000-155dfffff : Persistent Memory (legacy)
> >>   100000000-155dfffff : namespace0.0
> >> 155e00000-15982bfff : System RAM
> >>   155e00000-156a00fa0 : Kernel code
> >>   156a00fa1-15765d67f : Kernel data
> >>   157837000-1597fffff : Kernel bss
> >> 15982c000-1ffffffff : Persistent Memory (legacy)
> >> 200000000-87fffffff : System RAM
> >
> > Ok, looks like 4G is bad choice to land the pmem emulation on this
> > system because it collides with where the kernel is deployed and gets
> > broken into tiny pieces that violate kasan's. This is a known problem
> > with memmap=. You need to pick an memory range that does not collide
> > with anything else. See:
> >
> > https://nvdimm.wiki.kernel.org/how_to_choose_the_correct_memmap_kernel_parameter_for_pmem_on_your_system
> >
> > ...for more info.
>
> Well, it seems I did exactly follow the information in that link,
>
> [    0.000000] BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000093fff] usable
> [    0.000000] BIOS-e820: [mem 0x0000000000094000-0x000000000009ffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000005a7a0fff] usable
> [    0.000000] BIOS-e820: [mem 0x000000005a7a1000-0x000000005b5e0fff] reserved
> [    0.000000] BIOS-e820: [mem 0x000000005b5e1000-0x00000000790fefff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000790ff000-0x00000000791fefff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000791ff000-0x000000007b5fefff] ACPI NVS
> [    0.000000] BIOS-e820: [mem 0x000000007b5ff000-0x000000007b7fefff] ACPI data
> [    0.000000] BIOS-e820: [mem 0x000000007b7ff000-0x000000007b7fffff] usable
> [    0.000000] BIOS-e820: [mem 0x000000007b800000-0x000000008fffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000ff800000-0x00000000ffffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000087fffffff] usable
>
> Where 4G is good.
> Then,
>
> [    0.000000] user-defined physical RAM map:
> [    0.000000] user: [mem 0x0000000000000000-0x0000000000093fff] usable
> [    0.000000] user: [mem 0x0000000000094000-0x000000000009ffff] reserved
> [    0.000000] user: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> [    0.000000] user: [mem 0x0000000000100000-0x000000005a7a0fff] usable
> [    0.000000] user: [mem 0x000000005a7a1000-0x000000005b5e0fff] reserved
> [    0.000000] user: [mem 0x000000005b5e1000-0x00000000790fefff] usable
> [    0.000000] user: [mem 0x00000000790ff000-0x00000000791fefff] reserved
> [    0.000000] user: [mem 0x00000000791ff000-0x000000007b5fefff] ACPI NVS
> [    0.000000] user: [mem 0x000000007b5ff000-0x000000007b7fefff] ACPI data
> [    0.000000] user: [mem 0x000000007b7ff000-0x000000007b7fffff] usable
> [    0.000000] user: [mem 0x000000007b800000-0x000000008fffffff] reserved
> [    0.000000] user: [mem 0x00000000ff800000-0x00000000ffffffff] reserved
> [    0.000000] user: [mem 0x0000000100000000-0x00000001ffffffff] persistent (type 12)
> [    0.000000] user: [mem 0x0000000200000000-0x000000087fffffff] usable
>
> The doc did mention that “There seems to be an issue with CONFIG_KSAN at the moment however.”
> without more detail though.

Does disabling CONFIG_RANDOMIZE_BASE help? Maybe that workaround has
regressed. Effectively we need to find what is causing the kernel to
sometimes be placed in the middle of a custom reserved memmap= range.
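
To make it easier to spot on which boots the kernel image lands inside the
memmap= range, something like the sketch below could be run against a saved
copy of /proc/iomem. It is only an illustration: the helper names
(parse_iomem, kernel_collisions) are made up for this mail, the sample text
is the /proc/iomem excerpt quoted above, and the range is the 4G!4G from
the reported command line.

```python
import re

# Matches lines such as "  155e00000-156a00fa0 : Kernel code".
IOMEM_LINE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.+)$")


def parse_iomem(text):
    """Return (start, end, name) tuples from /proc/iomem-style text."""
    entries = []
    for line in text.splitlines():
        m = IOMEM_LINE.match(line)
        if m:
            start, end = int(m.group(1), 16), int(m.group(2), 16)
            entries.append((start, end, m.group(3).strip()))
    return entries


def kernel_collisions(entries, start, size):
    """List kernel image resources overlapping [start, start + size)."""
    end = start + size - 1
    return [e for e in entries
            if e[2].startswith("Kernel ") and e[0] <= end and e[1] >= start]


# Excerpt quoted earlier in this thread (hypothetical stand-in for a file).
SAMPLE = """\
100000000-155dfffff : Persistent Memory (legacy)
  100000000-155dfffff : namespace0.0
155e00000-15982bfff : System RAM
  155e00000-156a00fa0 : Kernel code
  156a00fa1-15765d67f : Kernel data
  157837000-1597fffff : Kernel bss
15982c000-1ffffffff : Persistent Memory (legacy)
200000000-87fffffff : System RAM
"""

GIB = 1 << 30
# memmap=4G!4G means size 4G starting at offset 4G.
hits = kernel_collisions(parse_iomem(SAMPLE), 4 * GIB, 4 * GIB)
for s, e, name in hits:
    print(f"{name}: {s:#x}-{e:#x} lies inside the memmap= range")
```

Comparing the result across boots with and without CONFIG_RANDOMIZE_BASE
should show whether the placement only happens with randomization enabled.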