On Fri, Jul 06, 2018 at 04:33:58PM -0600, Ross Zwisler wrote:
> The following commit in -next:
>
> commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and
> remove check")
>
> changed how the error handling in sparse_add_one_section() works.
>
> Previously sparse_index_init() could return -EEXIST, and the function would
> continue on happily. 'ret' would get unconditionally overwritten by the
> result from sparse_init_one_section() and the error code after the 'out:'
> label wouldn't be triggered.
>
> With the above referenced commit, though, an -EEXIST error return from
> sparse_index_init() now takes us through the function and into the error
> case after 'out:'. This eventually causes a kernel BUG, probably because
> we've just freed a memory section that we successfully set up and marked as
> present:
>
> BUG: unable to handle kernel paging request at ffffea0005000080
> RIP: 0010:memmap_init_zone+0x154/0x1cf
>
> Call Trace:
>  move_pfn_range_to_zone+0x168/0x180
>  devm_memremap_pages+0x29b/0x480
>  pmem_attach_disk+0x1ae/0x6c0 [nd_pmem]
>  ? devm_memremap+0x79/0xb0
>  nd_pmem_probe+0x7e/0xa0 [nd_pmem]
>  nvdimm_bus_probe+0x6e/0x160 [libnvdimm]
>  driver_probe_device+0x310/0x480
>  __device_attach_driver+0x86/0x100
>  ? __driver_attach+0x110/0x110
>  bus_for_each_drv+0x6e/0xb0
>  __device_attach+0xe2/0x160
>  device_initial_probe+0x13/0x20
>  bus_probe_device+0xa6/0xc0
>  device_add+0x41b/0x660
>  ? lock_acquire+0xa3/0x210
>  nd_async_device_register+0x12/0x40 [libnvdimm]
>  async_run_entry_fn+0x3e/0x170
>  process_one_work+0x230/0x680
>  worker_thread+0x3f/0x3b0
>  kthread+0x12f/0x150
>  ? process_one_work+0x680/0x680
>  ? kthread_create_worker_on_cpu+0x70/0x70
>  ret_from_fork+0x3a/0x50
>
> Fix this by clearing 'ret' back to 0 if sparse_index_init() returns
> -EEXIST. This restores the previous behavior.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>

Reviewed-by: Oscar Salvador <osalvador@xxxxxxx>
-- 
Oscar Salvador
SUSE L3
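
For anyone following the thread without the source at hand, the control flow described in the quoted changelog can be modelled in userspace roughly as below. This is only a sketch and not the kernel code: every toy_* name, the two-sections-per-root layout and the clear_eexist switch are made up for illustration. The switch selects between the behaviour after commit 054620849110 (ret left at -EEXIST) and the behaviour with the fix applied (ret cleared back to 0).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model; sizes and names are NOT taken from the kernel. */
#define TOY_SECTIONS_PER_ROOT 2
#define TOY_NR_SECTIONS       4

struct toy_section {
	bool present;      /* section has been marked present */
	bool memmap_live;  /* the memmap the section points at is still valid */
};

static bool toy_root_ready[TOY_NR_SECTIONS / TOY_SECTIONS_PER_ROOT];
static struct toy_section toy_sections[TOY_NR_SECTIONS];

/* Stand-in for sparse_index_init(): -EEXIST if the root already exists. */
static int toy_index_init(unsigned int nr)
{
	unsigned int root = nr / TOY_SECTIONS_PER_ROOT;

	if (toy_root_ready[root])
		return -EEXIST;
	toy_root_ready[root] = true;
	return 0;
}

/*
 * Stand-in for sparse_add_one_section().
 * clear_eexist == false: 'ret' keeps -EEXIST from the index init.
 * clear_eexist == true:  'ret' is cleared back to 0, as the fix does.
 */
static int toy_add_one_section(unsigned int nr, bool clear_eexist)
{
	struct toy_section *ms;
	bool attached = false;
	int ret;

	assert(nr < TOY_NR_SECTIONS);
	ms = &toy_sections[nr];

	ret = toy_index_init(nr);
	if (ret < 0 && ret != -EEXIST)
		return ret;
	if (clear_eexist)
		ret = 0;

	if (ms->present) {
		ret = -EEXIST;
		goto out;
	}

	/* The init step is void here, so it can no longer overwrite 'ret'. */
	ms->present = true;
	ms->memmap_live = true;
	attached = true;
out:
	/* Error path: frees the memmap, even one that was just attached. */
	if (ret < 0 && attached)
		ms->memmap_live = false;
	return ret;
}
```

In this model, adding a second section under an existing root without clearing 'ret' leaves the section marked present while its memmap is gone, which is the state the changelog suspects is behind the BUG. With 'ret' cleared, the same call succeeds and the memmap stays valid.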