On Wed, Aug 24, 2022 at 8:18 AM Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>
> The pcp_spin_lock_irqsave protecting the PCP lists is IRQ-safe as a task
> allocating from the PCP must not re-enter the allocator from IRQ context.
> In each instance where IRQ-reentrancy is possible, the lock is acquired
> using pcp_spin_trylock_irqsave() even though IRQs are disabled and
> re-entrancy is impossible.
>
> Demoting the lock to pcp_spin_lock avoids an IRQ disable/enable in the
> common case at the cost of some IRQ allocations taking a slower path. If
> the PCP lists need to be refilled, the zone lock still needs to disable
> IRQs, but that will only happen on PCP refill and drain. If an IRQ is
> raised when a PCP allocation is in progress, the trylock will fail and
> the allocation will fall back to using the buddy lists directly. Note
> that this may not be a universal win if an interrupt-intensive workload
> also allocates heavily from interrupt context and contends heavily on
> the zone->lock as a result.

Hi,

This patch caused the following warning. Please take a look. Thanks.

WARNING: inconsistent lock state
6.0.0-dbg-DEV #1 Tainted: G S W O
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
ksoftirqd/2/27 [HC0[0]:SC1[1]:HE0:SE0] takes:
ffff9ce5002b8c58 (&pcp->lock){+.?.}-{2:2}, at: free_unref_page_list+0x1ac/0x260
{SOFTIRQ-ON-W} state was registered at:
  lock_acquire+0xb3/0x190
  _raw_spin_trylock+0x46/0x60
  rmqueue_pcplist+0x42/0x1d0
  rmqueue+0x58/0x590
  get_page_from_freelist+0x2c3/0x510
  __alloc_pages+0x126/0x210
  alloc_page_interleave+0x13/0x90
  alloc_pages+0xfb/0x250
  __get_free_pages+0x11/0x30
  __pte_alloc_kernel+0x1c/0xc0
  vmap_p4d_range+0x448/0x690
  ioremap_page_range+0xdc/0x130
  __ioremap_caller+0x258/0x320
  ioremap_cache+0x17/0x20
  acpi_os_map_iomem+0x12f/0x1d0
  acpi_os_map_memory+0xe/0x10
  acpi_tb_acquire_table+0x42/0x6e
  acpi_tb_validate_temp_table+0x43/0x55
  acpi_tb_verify_temp_table+0x31/0x238
  acpi_reallocate_root_table+0xe6/0x158
  acpi_early_init+0x4f/0xd1
  start_kernel+0x32a/0x44f
  x86_64_start_reservations+0x24/0x26
  x86_64_start_kernel+0x124/0x12b
  secondary_startup_64_no_verify+0xe6/0xeb
irq event stamp: 961581
hardirqs last enabled at (961580): [<ffffffff95b2cde5>] _raw_spin_unlock_irqrestore+0x35/0x50
hardirqs last disabled at (961581): [<ffffffff951c1998>] folio_rotate_reclaimable+0xf8/0x310
softirqs last enabled at (961490): [<ffffffff94fa40d8>] run_ksoftirqd+0x48/0x90
softirqs last disabled at (961495): [<ffffffff94fa40d8>] run_ksoftirqd+0x48/0x90

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&pcp->lock);
  <Interrupt>
    lock(&pcp->lock);

 *** DEADLOCK ***

1 lock held by ksoftirqd/2/27:
 #0: ffff9ce5002adab8 (lock#7){..-.}-{2:2}, at: local_lock_acquire+0x0/0x70

stack backtrace:
CPU: 2 PID: 27 Comm: ksoftirqd/2 Tainted: G S W O 6.0.0-dbg-DEV #1
Call Trace:
 <TASK>
 dump_stack_lvl+0x6c/0x9a
 dump_stack+0x10/0x12
 print_usage_bug+0x374/0x380
 mark_lock_irq+0x4a8/0x4c0
 ? save_trace+0x40/0x2c0
 mark_lock+0x137/0x1b0
 __lock_acquire+0x5bf/0x3540
 ? __SCT__tp_func_virtio_transport_recv_pkt+0x7/0x8
 ? lock_is_held_type+0x96/0x130
 ? rcu_read_lock_sched_held+0x49/0xa0
 lock_acquire+0xb3/0x190
 ? free_unref_page_list+0x1ac/0x260
 _raw_spin_lock+0x2f/0x40
 ? free_unref_page_list+0x1ac/0x260
 free_unref_page_list+0x1ac/0x260
 release_pages+0x90a/0xa70
 ? folio_batch_move_lru+0x138/0x190
 ? local_lock_acquire+0x70/0x70
 folio_batch_move_lru+0x147/0x190
 folio_rotate_reclaimable+0x168/0x310
 folio_end_writeback+0x5d/0x200
 end_page_writeback+0x18/0x40
 end_swap_bio_write+0x100/0x2b0
 ? bio_chain+0x30/0x30
 bio_endio+0xd8/0xf0
 blk_update_request+0x173/0x340
 scsi_end_request+0x2a/0x300
 scsi_io_completion+0x66/0x140
 scsi_finish_command+0xc0/0xf0
 scsi_complete+0xec/0x110
 blk_done_softirq+0x53/0x70
 __do_softirq+0x1e2/0x357
 ? run_ksoftirqd+0x48/0x90
 run_ksoftirqd+0x48/0x90
 smpboot_thread_fn+0x14b/0x1c0
 kthread+0xe6/0x100
 ? cpu_report_death+0x50/0x50
 ? kthread_blkcg+0x40/0x40
 ret_from_fork+0x1f/0x30
 </TASK>
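
For readers following along, the locking pattern the quoted changelog
describes, and the inconsistency lockdep is flagging above, reduces to
roughly the sketch below. This is an illustrative reduction only, not the
actual mm/page_alloc.c code: remove_from_pcp_list() and add_to_pcp_list()
are hypothetical stand-ins, and the usual kernel headers
(<linux/spinlock.h>, <linux/mm_types.h>) are assumed.

	/*
	 * Allocation side (cf. rmqueue_pcplist above): the lock is taken
	 * with a plain trylock, softirqs left enabled. An interrupt that
	 * hits a lock holder on the same CPU fails the trylock and falls
	 * back to the buddy lists instead of spinning on itself.
	 */
	static struct page *pcp_alloc_sketch(struct per_cpu_pages *pcp)
	{
		struct page *page;

		if (!spin_trylock(&pcp->lock))
			return NULL;	/* caller falls back to buddy lists */
		page = remove_from_pcp_list(pcp);	/* hypothetical helper */
		spin_unlock(&pcp->lock);
		return page;
	}

	/*
	 * Free side (cf. free_unref_page_list above): an unconditional
	 * spin_lock. If this runs in softirq context on a CPU that already
	 * holds pcp->lock from task context via the trylock above, the CPU
	 * deadlocks on itself -- the {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W}
	 * inconsistency in the report.
	 */
	static void pcp_free_sketch(struct per_cpu_pages *pcp, struct page *page)
	{
		spin_lock(&pcp->lock);	/* no trylock, no softirq protection */
		add_to_pcp_list(pcp, page);	/* hypothetical helper */
		spin_unlock(&pcp->lock);
	}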