Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes: > On Thu, Apr 23, 2020 at 12:42 PM Eric W. Biederman > <ebiederm@xxxxxxxxxxxx> wrote: >> >> +void exchange_tids(struct task_struct *ntask, struct task_struct *otask) >> +{ >> + /* pid_links[PIDTYPE_PID].next is always NULL */ >> + struct pid *npid = READ_ONCE(ntask->thread_pid); >> + struct pid *opid = READ_ONCE(otask->thread_pid); >> + >> + rcu_assign_pointer(opid->tasks[PIDTYPE_PID].first, &ntask->pid_links[PIDTYPE_PID]); >> + rcu_assign_pointer(npid->tasks[PIDTYPE_PID].first, &otask->pid_links[PIDTYPE_PID]); >> + rcu_assign_pointer(ntask->thread_pid, opid); >> + rcu_assign_pointer(otask->thread_pid, npid); >> + WRITE_ONCE(ntask->pid_links[PIDTYPE_PID].pprev, &opid->tasks[PIDTYPE_PID].first); >> + WRITE_ONCE(otask->pid_links[PIDTYPE_PID].pprev, &npid->tasks[PIDTYPE_PID].first); >> + WRITE_ONCE(ntask->pid, pid_nr(opid)); >> + WRITE_ONCE(otask->pid, pid_nr(npid)); >> +} > > This function is _very_ hard to read as written. > > It really wants a helper function to do the swapping per hlist_head > and hlist_node, I think. And "opid/npid" is very hard to see, and the > naming doesn't make much sense (if it's an "exchange", then why is it > "old/new" - they're symmetric). > > At least something like > > struct hlist_head *old_pid_hlist = opid->tasks + PIDTYPE_PID; > struct hlist_head *new_pid_hlist = npid->tasks + PIDTYPE_PID; > struct hlist_node *old_pid_node = otask->pid_links + PIDTYPE_PID; > struct hlist_node *new_pid_node = ntask->pid_links + PIDTYPE_PID; > > struct hlist_node *old_first_node = old_pid_hlist->first; > struct hlist_node *new_first_node = new_pid_hlist->first; > > and then trying to group up the first/pprev/thread_pid/pid accesses > so that you them together, and using a helper function that does the > whole switch, so that you'd have > > /* Move new node to old hlist, and update thread_pid/pid fields */ > insert_pid_pointers(old_pid_hlist, new_pid_node, new_first_node); > rcu_assign_pointer(ntask->thread_pid, opid); > WRITE_ONCE(ntask->pid, pid_nr(opid)); > > /* Move old new to new hlist, and update thread_pid/pid fields */ > insert_pid_pointers(new_pid_hlist, old_pid_node, old_first_node); > rcu_assign_pointer(otask->thread_pid, npid); > WRITE_ONCE(otask->pid, pid_nr(npid)); > > or something roughly like that. > > (And the above still uses "old/new", which as mentioned sounds wrong > to me. Maybe it should just be "a_xyz" and "b_xyz"? Also note that I > did this in my MUA, so I could have gotten the names and types wrong > etc). > > I think that would make it look at least _slightly_ less like random > line noise and easier to follow. > > But maybe even a rcu_hlist_swap() helper? We have one for regular > lists. Do we really have to do it all written out, not do it with a > "remove and reinsert" model? At one point my brain I had forgetten that xchg can not take two memory arguments and had hoped to be able to provide stronger guarnatees than I can. Which is where I think the structure of exchange_pids came from. I do agree the clearer we can write things, the easier it is for someone else to come along and follow. We can not use a remove and reinser model because that does break rcu accesses, and complicates everything else. With a swap model we have the struct pids pointer at either of the tasks that are swapped but never at nothing. With a remove/reinsert model we have to deal the addittional possibility of the pids not pointing at a thread at all which can result in things like signals not being delivered at all. I played with it a bit and the best I have been able to come up is: void hlist_swap_before_rcu(struct hlist_node *left, struct hlist_node *right) { struct hlist_node **lpprev = left->pprev; struct hlist_node **rpprev = right->pprev; rcu_assign_pointer(*lpprev, right); rcu_assign_pointer(*rpprev, left); WRITE_ONCE(left->pprev, rpprev); WRITE_ONCE(right->pprev, lpprev); } void exchange_tids(struct task_struct *left, struct task_struct *right) { struct hlist_node *lnode = &left->pid_links[PIDTYPE_PID]; struct hlist_node *rnode = &right->pid_links[PIDTYPE_PID]; struct pid *lpid, *rpid; /* Replace the single entry tid lists with each other */ hlist_swap_before_rcu(lnode, rnode); /* Swap thread_pid */ rpid = left->thread_pid; lpid = right->thread_pid; rcu_assign_pointer(left->thread_pid, lpid); rcu_assign_pointer(right->thread_pid, rpid); /* Swap the cached pid value */ WRITE_ONCE(left->pid, pid_nr(lpid)); WRITE_ONCE(right->pid, pid_nr(rpid)); } hlists because they are not doubly linked can legitimately swap their beginnings or their tails. Something that regular lists can not, and I think that is exactly the general purpose semantic I want. Does that look a little more readable? Eric