[ adding KASAN devs...]

On Mon, Jun 4, 2018 at 4:40 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> On Sun, Jun 3, 2018 at 6:48 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>> On Sun, Jun 3, 2018 at 5:25 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>>> On Mon, Jun 04, 2018 at 08:20:38AM +1000, Dave Chinner wrote:
>>>> On Thu, May 31, 2018 at 09:02:52PM -0700, Dan Williams wrote:
>>>> > On Thu, May 31, 2018 at 7:24 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>>>> > > On Thu, May 31, 2018 at 06:57:33PM -0700, Dan Williams wrote:
>>>> > >> > FWIW, XFS+DAX used to just work on this setup (I hadn't even
>>>> > >> > installed ndctl until this morning!) but after changing the kernel
>>>> > >> > it no longer works. That would make it a regression, yes?
>>>>
>>>> [....]
>>>>
>>>> > >> I suspect your kernel does not have CONFIG_ZONE_DEVICE enabled which
>>>> > >> has the following dependencies:
>>>> > >>
>>>> > >>         depends on MEMORY_HOTPLUG
>>>> > >>         depends on MEMORY_HOTREMOVE
>>>> > >>         depends on SPARSEMEM_VMEMMAP
>>>> > >
>>>> > > Filesystem DAX now has a dependency on memory hotplug?
>>>>
>>>> [....]
>>>>
>>>> > > OK, works now I've found the magic config incantantions to turn
>>>> > > everything I now need on.
>>>>
>>>> By enabling these options, my test VM now has a ~30s pause in the
>>>> boot very soon after the nvdimm subsystem is initialised.
>>>>
>>>> [    1.523718] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>>>> [    1.550353] 00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
>>>> [    1.552175] Non-volatile memory driver v1.3
>>>> [    2.332045] tsc: Refined TSC clocksource calibration: 2199.909 MHz
>>>> [    2.333280] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1fb5dcd4620, max_idle_ns: 440795264143 ns
>>>> [   37.217453] brd: module loaded
>>>> [   37.225423] loop: module loaded
>>>> [   37.228441] virtio_blk virtio2: [vda] 10485760 512-byte logical blocks (5.37 GB/5.00 GiB)
>>>> [   37.245418] virtio_blk virtio3: [vdb] 146800640 512-byte logical blocks (75.2 GB/70.0 GiB)
>>>> [   37.255794] virtio_blk virtio4: [vdc] 1073741824000 512-byte logical blocks (550 TB/500 TiB)
>>>> [   37.265403] nd_pmem namespace1.0: unable to guarantee persistence of writes
>>>> [   37.265618] nd_pmem namespace0.0: unable to guarantee persistence of writes
>>>>
>>>> The system does not appear to be consuming CPU, but it is blocking
>>>> NMIs so I can't get a CPU trace. For a VM that I rely on booting in
>>>> a few seconds because I reboot it tens of times a day, this is a
>>>> problem....
>>>
>>> And when I turn on KASAN, the kernel fails to boot to a login prompt
>>> because:
>>
>> What's your qemu and kernel command line? I'll take look at this first
>> thing tomorrow.
>
> I was able to reproduce this crash by just turning on KASAN...
> investigating. It would still help to have your config for our own
> regression testing purposes it makes sense for us to prioritize
> "Dave's test config", similar to the priority of not breaking Linus'
> laptop.

I believe this is a bug in KASAN, or a bug in devm_memremap_pages(),
depending on your point of view. At the very least it is a mismatch of
assumptions. KASAN learns of hot-added memory via the memory hotplug
notifier. However, the devm_memremap_pages() implementation is
intentionally limited to the "first half" of the memory hotplug
procedure, i.e. it does just enough to set up the linear map for
pfn_to_page() and initialize the "struct page" memmap, but then stops
short of onlining the pages. This is why we are getting a NULL ptr
deref and not a KASAN report: KASAN has no shadow area set up for the
linearly mapped pmem range.

In terms of solving it, we could refactor kasan_mem_notifier() so that
devm_memremap_pages() can call it outside of the notifier... I'll give
this a shot.
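
To make the mismatch concrete, here is a toy userspace model of it. This
is not kernel code and not a proposed patch: every toy_* name is
hypothetical, the two tables merely stand in for the linear map and the
KASAN shadow, and the "call_kasan_directly" flag is only an assumption
about what the refactor described above might amount to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_MAX_RANGES 8

struct toy_range { unsigned long start, size; };
struct toy_table { struct toy_range r[TOY_MAX_RANGES]; size_t nr; };

static struct toy_table toy_linear_map; /* ranges with a linear mapping */
static struct toy_table toy_shadow;     /* ranges with shadow set up */

static void toy_table_add(struct toy_table *t, unsigned long start,
                          unsigned long size)
{
        assert(t->nr < TOY_MAX_RANGES);
        t->r[t->nr].start = start;
        t->r[t->nr].size = size;
        t->nr++;
}

static bool toy_table_covers(const struct toy_table *t, unsigned long addr)
{
        for (size_t i = 0; i < t->nr; i++)
                if (addr >= t->r[i].start &&
                    addr - t->r[i].start < t->r[i].size)
                        return true;
        return false;
}

/* Stand-in for the work the sanitizer does from its hotplug notifier. */
void toy_kasan_add_shadow(unsigned long start, unsigned long size)
{
        toy_table_add(&toy_shadow, start, size);
}

/* Regular hotplug: "first half" (mapping), then the notifier runs. */
void toy_hotplug_full(unsigned long start, unsigned long size)
{
        toy_table_add(&toy_linear_map, start, size);
        toy_kasan_add_shadow(start, size); /* reached via the notifier */
}

/*
 * Stand-in for the devm_memremap_pages() path: only the "first half".
 * The notifier never fires, so shadow appears only if the caller asks
 * for it explicitly (the assumed shape of the refactor).
 */
void toy_memremap_pages(unsigned long start, unsigned long size,
                        bool call_kasan_directly)
{
        toy_table_add(&toy_linear_map, start, size);
        if (call_kasan_directly)
                toy_kasan_add_shadow(start, size);
}

enum toy_access { TOY_ACCESS_OK, TOY_ACCESS_NO_MAPPING, TOY_ACCESS_NO_SHADOW };

/* An instrumented access consults the shadow before touching memory. */
enum toy_access toy_checked_access(unsigned long addr)
{
        if (!toy_table_covers(&toy_linear_map, addr))
                return TOY_ACCESS_NO_MAPPING;
        if (!toy_table_covers(&toy_shadow, addr))
                return TOY_ACCESS_NO_SHADOW; /* the check itself would fault */
        return TOY_ACCESS_OK;
}
```

In this model, a range added with toy_hotplug_full() passes
toy_checked_access(), while a range added with
toy_memremap_pages(start, size, false) is mapped but has no shadow,
which corresponds to the instrumented access faulting instead of
producing a report. Passing true models calling the shadow setup
directly from the memremap path.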