On Wed, May 18, 2022 at 10:15:03AM -0700, Paul E. McKenney wrote:
> So does this python script somehow change the tracing state? (It does
> not look to me like it does, but I could easily be missing something.)

No, I don't think so either. It pretty much just offlines memory sections
one at a time.

> Either way, is there something else waiting for these RCU flavors?
> (There should not be.) Nevertheless, if so, there should be
> a synchronize_rcu_tasks(), synchronize_rcu_tasks_rude(), or
> synchronize_rcu_tasks_trace() on some other blocked task's stack
> somewhere.

There are only three blocked tasks when this happens. The kmemleak_scan()
is just the victim waiting for the locks taken by the stuck
offline_pages()->synchronize_rcu() task.

task:kmemleak state:D stack:25824 pid: 1033 ppid: 2 flags:0x00000008
Call trace:
 __switch_to
 __schedule
 schedule
 percpu_rwsem_wait
 __percpu_down_read
 percpu_down_read.constprop.0
 get_online_mems
 kmemleak_scan
 kmemleak_scan_thread
 kthread
 ret_from_fork

task:cppc_fie state:D stack:23472 pid: 1848 ppid: 2 flags:0x00000008
Call trace:
 __switch_to
 __schedule
 lockdep_recursion

task:tee state:D stack:24816 pid:16733 ppid: 16732 flags:0x0000020c
Call trace:
 __switch_to
 __schedule
 schedule
 schedule_timeout
 __wait_for_common
 wait_for_completion
 __wait_rcu_gp
 synchronize_rcu
 lru_cache_disable
 __alloc_contig_migrate_range
 isolate_single_pageblock
 start_isolate_page_range
 offline_pages
 memory_subsys_offline
 device_offline
 online_store
 dev_attr_store
 sysfs_kf_write
 kernfs_fop_write_iter
 new_sync_write
 vfs_write
 ksys_write
 __arm64_sys_write
 invoke_syscall
 el0_svc_common.constprop.0
 do_el0_svc
 el0_svc
 el0t_64_sync_handler
 el0t_64_sync

> Or maybe something sleeps waiting for an RCU Tasks * callback to
> be invoked. In that case (and in the above case, for that matter),
> at least one of these pointers would be non-NULL on some CPU:
>
> 1. rcu_tasks__percpu.cblist.head
> 2. rcu_tasks_rude__percpu.cblist.head
> 3. rcu_tasks_trace__percpu.cblist.head
>
> The ->func field of the pointed-to structure contains a pointer to
> the callback function, which will help work out what is going on.
> (Most likely a wakeup being lost or not provided.)

What would be an easy way to find those out? I can't see anything
interesting in the output of sysrq-t.

> Alternatively, if your system has hundreds of thousands of tasks and
> you have attached BPF programs to short-lived socket structures and you
> don't yet have the workaround, then you can see hangs. (I am working on a
> longer-term fix.) In the short term, applying the workaround is the right
> thing to do. (Adding a couple of the BPF guys on CC for their thoughts.)

The system is pretty much idle after a fresh reboot. The only workload is
to run the script.
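
For reference, here is a rough sketch of what a script like that boils down to. It is not the actual script. It assumes the standard memory hotplug sysfs interface (/sys/devices/system/memory/memoryN/online), which matches the online_store() path in the tee trace above. The function name offline_sections and its sysfs_root parameter are made up for illustration.

```python
import os
import re


def offline_sections(sysfs_root="/sys/devices/system/memory"):
    """Try to offline every memoryN block under sysfs_root, one at a time.

    Hypothetical helper, not the script from this thread. Returns a dict
    mapping block name to True (write accepted) or False (write failed).
    """
    results = {}
    # Only memoryN directories are memory blocks; skip other sysfs entries.
    names = [n for n in os.listdir(sysfs_root) if re.fullmatch(r"memory\d+", n)]
    for name in sorted(names, key=lambda n: int(n[len("memory"):])):
        path = os.path.join(sysfs_root, name, "online")
        try:
            # Writing "0" asks the kernel to offline this block; on a real
            # system this blocks until offline_pages() returns.
            with open(path, "w") as f:
                f.write("0")
            results[name] = True
        except OSError:
            # E.g. EBUSY when the block holds unmovable pages.
            results[name] = False
    return results
```

Calling offline_sections() with the default path needs root and really does offline memory, so point sysfs_root at a scratch directory when just trying it out.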