On Thu, May 19, 2022 at 12:15:24PM -0700, Paul E. McKenney wrote:
> Is the task doing offline_pages()->synchronize_rcu() doing this
> repeatedly? Or is there a stalled RCU grace period? (From what
> I can see, offline_pages() is not doing huge numbers of calls to
> synchronize_rcu() in any of its loops, but I freely admit that I do not
> know this code.)

Yes, we are running into an endless loop in isolate_single_pageblock().
A similar issue happened not long ago, so I am wondering whether we did
not solve it entirely then. Anyway, I will continue the thread over there.

https://lore.kernel.org/all/YoavU%2F+NfQIzQiDF@qian/

> Or is it possible that reverting those three patches simply decreases
> the probability of failure, rather than eliminating the failure?
> Such a decrease could be due to many things, for example, changes to
> offsets and sizes of data structures.

Entirely possible. Sorry for the false alarm.

> Do you ever see RCU CPU stall warnings?

No.