On 2018/09/06 23:39, Michal Hocko wrote:
>>>> I know /proc/sys/vm/oom_dump_tasks . Showing some entries while not always
>>>> printing all entries might be helpful.
>>>
>>> Not really. It could be more confusing than helpful. The main purpose of
>>> the listing is to double check the list to understand the oom victim
>>> selection. If you have a partial list you simply cannot do that.
>>
>> It serves as a safeguard for avoiding RCU stall warnings.
>>
>>>
>>> If the iteration takes too long and I can imagine it does with zillions
>>> of tasks then the proper way around it is either release the lock
>>> periodically after N tasks is processed or outright skip the whole thing
>>> if there are too many tasks. The first option is obviously tricky to
>>> prevent from duplicate entries or other artifacts.
>>>
>>
>> Can we add rcu_lock_break() like check_hung_uninterruptible_tasks() does?
>
> This would be a better variant of your timeout based approach. But it
> can still produce an incomplete task list so it still consumes a lot of
> resources to print a long list of tasks potentially while that list is not
> useful for any evaluation. Maybe that is good enough. I don't know. I
> would generally recommend to disable the whole thing with workloads with
> many tasks though.
>

The "safeguard" is useful when there are _unexpectedly_ many tasks (like
syzbot in this case). Why not allow those who want to avoid a lockup to
avoid it, rather than forcing them to disable the whole thing?
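
To make the "release the lock periodically after N tasks" idea concrete,
here is a rough userspace model of a batched walk. It is only a sketch, not
kernel code and not a proposed patch: struct mock_task, DUMP_BATCH,
mock_read_lock()/mock_read_unlock(), mock_lock_break() and
dump_tasks_batched() are all hypothetical names made up for illustration,
and the batch size is arbitrary. It shows both sides of the trade-off
discussed above: the lock is never held for more than one batch, but the
walk has to stop early (leaving a partial list) if the task it was standing
on is gone after the break.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct mock_task {
	int pid;
	bool alive;	/* models what pid_alive() would report */
};

#define DUMP_BATCH 4	/* assumed batch size, chosen only for this example */

static int lock_depth;	/* models being inside a read-side critical section */
static int lock_breaks;	/* how many times the lock was dropped and retaken */

static void mock_read_lock(void)   { lock_depth++; }
static void mock_read_unlock(void) { lock_depth--; }

/*
 * Drop and retake the lock so that waiters can make progress.
 * Returns false when the current task is no longer alive, in which
 * case the caller cannot safely continue the walk from it.
 */
static bool mock_lock_break(const struct mock_task *t)
{
	mock_read_unlock();
	lock_breaks++;
	/* a real kernel loop would call cond_resched() at this point */
	mock_read_lock();
	return t->alive;
}

/* Prints up to n tasks and returns how many were actually printed. */
static size_t dump_tasks_batched(const struct mock_task *tasks, size_t n)
{
	size_t printed = 0;
	int batch = DUMP_BATCH;

	mock_read_lock();
	for (size_t i = 0; i < n; i++) {
		if (!--batch) {
			batch = DUMP_BATCH;
			if (!mock_lock_break(&tasks[i]))
				break;	/* the listing is now incomplete */
		}
		assert(lock_depth == 1);	/* always printed under the lock */
		printf("[%7d] mock_task\n", tasks[i].pid);
		printed++;
	}
	mock_read_unlock();
	return printed;
}
```

With ten live tasks this model prints all ten and breaks the lock twice.
If the eighth task is marked dead beforehand, the walk stops after seven
entries, which is exactly the "incomplete task list" case being debated.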