Hi Mel, mm experts,

With 6.1-rc2 we started hitting the WARN_ON added in 71e2d666ef85
("mm/huge_memory: do not clobber swp_entry_t during THP split") in i915
automated CI. It looks like this:

<4> [259.367534] page:ffffea0008850000 refcount:0 mapcount:0 mapping:ffff88811a756a00 index:0x0 pfn:0x221400
<4> [259.367593] head:ffffea0008850000 order:9 compound_mapcount:0 compound_pincount:0
<4> [259.367596] aops:shmem_aops ino:2 dentry name:"i915"
<4> [259.367600] flags: 0x80000000000d003f(locked|referenced|uptodate|dirty|lru|active|head|reclaim|swapbacked|zone=2)
<4> [259.367604] raw: 80000000000d003f ffffea00042d08c8 ffffea0008855248 ffff88811a756a00
<4> [259.367606] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
<4> [259.367607] page dumped because: VM_WARN_ON_ONCE_PAGE(page_tail->private != 0)
<4> [259.367613] ------------[ cut here ]------------
<4> [259.367614] WARNING: CPU: 2 PID: 5515 at mm/huge_memory.c:2465 split_huge_page_to_list+0x12de/0x1760
<4> [259.367619] Modules linked in: i915(+) drm_display_helper drm_kms_helper vgem drm_shmem_helper snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm prime_numbers ttm drm_buddy syscopyarea sysfillrect sysimgblt fb_sys_fops fuse x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel igb ptp mei_me pps_core i2c_i801 i2c_smbus mei video acpi_power_meter wmi [last unloaded: i915]
<4> [259.367663] CPU: 2 PID: 5515 Comm: i915_selftest Tainted: G U I 6.1.0-rc2-CI_DRM_12280-g7bb7f55322b3+ #1
<4> [259.367666] Hardware name: Intel Corporation S1200SP/S1200SP, BIOS S1200SP.86B.03.01.0026.092720170729 09/27/2017
<4> [259.367667] RIP: 0010:split_huge_page_to_list+0x12de/0x1760
<4> [259.367670] Code: 86 00 e9 31 fa ff ff 80 3d b8 3a 5a 01 00 0f 85 bb f4 ff ff 48 c7 c6 60 8c 2c 82 48 89 df e8 39 26 f9 ff c6 05 9c 3a 5a 01 01 <0f> 0b e9 9e f4 ff ff 48 83 e8 01 e9 a9 f8 ff ff 48 8b 45 08 49 89
<4> [259.367672] RSP: 0018:ffffc9000146b738 EFLAGS: 00010046
<4> [259.367675] RAX: 0000000000000042 RBX: ffffea0008850000 RCX: 0000000000000003
<4> [259.367677] RDX: 0000000000000000 RSI: ffffffff822ca515 RDI: 00000000ffffffff
<4> [259.367678] RBP: ffffea0008855200 R08: 0000000000000000 R09: c0000000ffffdf42
<4> [259.367680] R10: 00000000001a5e18 R11: ffffc9000146b5d8 R12: ffffea0008850000
<4> [259.367681] R13: ffffea0008850000 R14: ffff88811a756a00 R15: 0000000000000200
<4> [259.367683] FS:  00007f42aafecc00(0000) GS:ffff88826b300000(0000) knlGS:0000000000000000
<4> [259.367685] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [259.367687] CR2: 000055792f60d9f0 CR3: 0000000107514005 CR4: 00000000003706e0
<4> [259.367688] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4> [259.367690] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4> [259.367691] Call Trace:
<4> [259.367692]  <TASK>
<4> [259.367706]  shmem_writepage+0x53/0x5e0
<4> [259.367710]  ? folio_mapping+0x47/0x80
<4> [259.367715]  __shmem_writeback+0x1f2/0x510 [i915]
<4> [259.367853]  shmem_shrink+0x3a/0x50 [i915]
<4> [259.367961]  i915_gem_shrink+0x57c/0x860 [i915]
<4> [259.368083]  igt_shrink_thp+0x362/0x490 [i915]
<4> [259.368209]  __i915_subtests.cold.7+0x42/0x92 [i915]
<4> [259.368345]  ? __i915_nop_teardown+0x10/0x10 [i915]
<4> [259.368495]  ? __i915_live_setup+0x30/0x30 [i915]
<4> [259.368612]  __run_selftests.part.3+0xfa/0x158 [i915]
<4> [259.368747]  i915_live_selftests.cold.5+0x1f/0x4f [i915]
<4> [259.368878]  i915_pci_probe+0xd6/0x240 [i915]
<4> [259.368965]  ? _raw_spin_unlock_irqrestore+0x3d/0x70
<4> [259.368971]  pci_device_probe+0x98/0x110
<4> [259.368976]  really_probe+0xd9/0x350
<4> [259.368979]  ? pm_runtime_barrier+0x4b/0xa0
<4> [259.368985]  __driver_probe_device+0x73/0x170
<4> [259.368989]  driver_probe_device+0x1a/0x90
<4> [259.368992]  __driver_attach+0xbc/0x190
<4> [259.368995]  ? __device_attach_driver+0x110/0x110
<4> [259.368998]  ? __device_attach_driver+0x110/0x110
<4> [259.369001]  bus_for_each_dev+0x75/0xc0
<4> [259.369006]  bus_add_driver+0x1bb/0x210
<4> [259.369012]  driver_register+0x66/0xc0
<4> [259.369015]  i915_init+0x22/0x82 [i915]
<4> [259.369098]  ? 0xffffffffa0860000
<4> [259.369101]  do_one_initcall+0x56/0x2f0
<4> [259.369105]  ? rcu_read_lock_sched_held+0x51/0x80
<4> [259.369109]  ? kmalloc_trace+0xae/0x100
<4> [259.369113]  do_init_module+0x45/0x1c0
<4> [259.369117]  load_module+0x1d5e/0x1e90
<4> [259.369134]  ? __do_sys_finit_module+0xaf/0x120
<4> [259.369137]  __do_sys_finit_module+0xaf/0x120
<4> [259.369150]  do_syscall_64+0x3a/0x90
<4> [259.369154]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
<4> [259.369157] RIP: 0033:0x7f42ad7bd89d
<4> [259.369159] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 f5 0c 00 f7 d8 64 89 01 48
<4> [259.369161] RSP: 002b:00007ffd788a2268 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
<4> [259.369164] RAX: ffffffffffffffda RBX: 000055d219ecf5a0 RCX: 00007f42ad7bd89d
<4> [259.369166] RDX: 0000000000000000 RSI: 000055d219edb390 RDI: 0000000000000006
<4> [259.369167] RBP: 0000000000000020 R08: 00007ffd788a1040 R09: 000055d219ec3470
<4> [259.369168] R10: 00007ffd788a23b0 R11: 0000000000000246 R12: 000055d219edb390
<4> [259.369170] R13: 0000000000000000 R14: 000055d219eda730 R15: 000055d219ecf5a0
<4> [259.369180]  </TASK>
<4> [259.369181] irq event stamp: 65138522
<4> [259.369183] hardirqs last enabled at (65138521): [<ffffffff81b73764>] _raw_spin_unlock_irqrestore+0x54/0x70
<4> [259.369186] hardirqs last disabled at (65138522): [<ffffffff8129ca93>] split_huge_page_to_list+0x5f3/0x1760
<4> [259.369189] softirqs last enabled at (65138410): [<ffffffff81e0031e>] __do_softirq+0x31e/0x48a
<4> [259.369191] softirqs last disabled at (65138403): [<ffffffff810c1b58>] irq_exit_rcu+0xb8/0xe0
<4> [259.369194] ---[ end trace 0000000000000000 ]---

At the point of the warning it should have been a single huge page, and
we then entered the i915 shrinker, which does this:

void __shmem_writeback(size_t size, struct address_space *mapping)
{
	struct writeback_control wbc = {
		.sync_mode = WB_SYNC_NONE,
		.nr_to_write = SWAP_CLUSTER_MAX,
		.range_start = 0,
		.range_end = LLONG_MAX,
		.for_reclaim = 1,
	};
	unsigned long i;

	/*
	 * Leave mmapings intact (GTT will have been revoked on unbinding,
	 * leaving only CPU mmapings around) and add those pages to the LRU
	 * instead of invoking writeback so they are aged and paged out
	 * as normal.
	 */

	/* Begin writeback on each dirty page */
	for (i = 0; i < size >> PAGE_SHIFT; i++) {
		struct page *page;

		page = find_lock_page(mapping, i);
		if (!page)
			continue;

		if (!page_mapped(page) && clear_page_dirty_for_io(page)) {
			int ret;

			SetPageReclaim(page);
			ret = mapping->a_ops->writepage(page, &wbc);
			if (!PageWriteback(page))
				ClearPageReclaim(page);
			if (!ret)
				goto put;
		}
		unlock_page(page);
put:
		put_page(page);
	}
}

I am not sure whether this loop is doing something incorrect, or what
else is going on here. Help and suggestions would be appreciated.

Regards,

Tvrtko
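P.S. For reference, my reading of the check we are tripping (paraphrased
from memory of 71e2d666ef85, so the exact 6.1-rc2 code may differ
slightly): __split_huge_page_tail() now only clears page_tail->private,
and warns if it was already non-zero, when the head page is not in the
swap cache, roughly:

	/* in __split_huge_page_tail(), as I understand 71e2d666ef85 */
	if (!folio_test_swapcache(page_folio(head))) {
		VM_WARN_ON_ONCE_PAGE(page_tail->private != 0, page_tail);
		page_tail->private = 0;
	}

So, if I read it right, the warning means one of the tail pages of our
shmem THP reaches the split with page->private already non-zero even
though the head is not a swap cache page.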