Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> OK, it seems that you are not going to take these preparatory
> cleanups ;)
>
> I'll resend along with the s/next_thread/__next_thread/ change.
> I was going to do the last change later, but this recent discussion
> https://lore.kernel.org/all/20230824143112.GA31208@xxxxxxxxxx/
> makes me think we should do this right now.

For the record I find this code confusing, and wrong.

It looks like it wants to keep the task_struct pointer or possibly
the struct pid pointer like proc does, but then it winds up keeping
a userspace pid value and regenerating both the struct pid pointer
and the struct task_struct pointer.

Which means that task_group_seq_get_next is unnecessarily slow and
has a built-in race condition, so it could wind up iterating through
a different process.

This whole thing looks to be a bad (aka racy) reimplementation of
first_tid and next_tid from proc. I thought the changes were to
adapt to the needs of bpf, but on closer examination the code is
just racy.

For this code to be correct bpf_iter_seq_task_common needs to store
at a minimum a struct pid pointer.

Oleg, your patch makes it easier to see how far this is from
first_tid/next_tid in proc.

Acked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>

Eric