Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes: > On Sun, Apr 26, 2020 at 7:14 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote: >> >> To support this add hlist_swap_before_rcu. An hlist primitive that >> will allow swaping the leading sections of two tasks. For exchanging >> the task pids it will just be swapping the hlist_heads of two single >> entry lists. But the functionality is more general. > > So I have no problems with the series now - the code is much more > understandable. Partly because of the split-up, partly because of the > comments, and partly because you explained the special case and why it > was a valid thing to do... > > However, I did start thinking about this case again. > > I still don't think the "swap entry" macro is necessarily useful in > _general_ - any time it's an actual individual entry, that swap macro > doesn't really work. But it isn't a "swap entry" macro/function. I did not even attempt to make it a "swap entry" function. I made a chop two lists into two and swap the pieces function. > So the only reason it works here is because you're actually swapping > the whole list. > > But that, in turn, shouldn't be using that "first node" model at all, > it should use the hlist_head. That would have made it a lot more > obvious what is actually going on to me. > > Now, the comment very much talks about the head case, but the code > still looks like it's swapping a non-head thing. > > I guess the code technically _works_ with "swap two list ends", but > does that actually really make sense as an operation? As an operation yes. Will anyone else want that operation I don't know. > So I no longer hate how this patch looks, but I wonder if we should > just make the whole "this node is the *first* node" a bit more > explicit in both the caller and in the swapping code. > > It could be as simple as replacing just the conceptual types and > names, so instead of some "pnode1" double-indirect node pointer, we'd > have > > struct hlist_head *left_head = container_of(left->pprev, > struct hlist_head, first); > struct hlist_head *right_head = container_of(right->pprev, > struct hlist_head, first); > > and then the code would do > > rcu_assign_pointer(right_head->first, left); > rcu_assign_pointer(left_head->first, right); > WRITE_ONCE(left->pprev, &right_head->first); > WRITE_ONCE(right->pprev, &left_head->first); > > which should generate the exact same code, but makes it clear that > what we're doing is switching the whole hlist when given the first > entries. > > Doesn't that make what it actually does a lot more understandable? Understandable is a bit subjective. I think having a well defined hlist operation I can call makes things more understandable. I think the getting the list head as: "head = &task->thread_pid->tasks[PIDTYPE_PID];" is more understandable and less risky than container_of. My concern and probably unreasonbable as this is a slow path with getting the list heads after looking up the pid is that it seems to add a wait for an additional cache line to load before anything can happen. The only way I really know to make this code much more understandable is to remove the lists entirely for this case. But that is a much larger change and it is not clear that it makes the kernel code overall better. I stared at that for a while and it is an interesting follow on but not something I want or we even can do before exchange_tids is in place. > The > *pnode1/pnode2 games are somewhat opaque, but with that type and name > change and using "container_of()", the code now fairly naturally reads > as "oh, we're changing the first pointers in the list heads, and > making the nodes point back to them" . > > Again - the current function _works_ with swapping two hlists in the > middle (not two entries - it swaps the whole list starting at that > entry!), so your current patch is in some ways "more generic". I'm > just suggesting that the generic case doesn't make much sense, and > that the "we know the first entries, swap the lists" actually is what > the real use is, and writing it as such makes the code easier to > understand. Yep. That is waht I designed it to do. I sort of went the other direction when writing this. I could start with the list heads and swap the rest of the lists and get the same code. But it looked like it would be a little slower to find the hlist_heads, and I couldn't think of a good name for the function. So I figured if I was writing a fucntion for this case I would write one that was convinient. For understandability that is my real challenge what is a good name that people can read and understand what is happening for this swapping function. > But I'm not going to insist on this, so this is more an RFC. Maybe > people disagree, and/or have an actual use case for that "break two > hlists in the middle, swap the ends" that I find unlikely... > > (NOTE: My "convert to hlist_head" code _works_ for that case too > because the code generation is the same! But it would be really really > confusing for that code to be used for anything but the first entry). Yes. I am open to improvements. Especially in the naming. Would hlists_swap_heads_rcu be noticably better? Eric