On Fri, 9 May 2014, Paul E. McKenney wrote:
> On Sat, May 10, 2014 at 12:57:15AM +0200, Thomas Gleixner wrote:
> > On Fri, 9 May 2014, Christoph Lameter wrote:
> > > On Fri, 9 May 2014, Thomas Gleixner wrote:
> > > > I understand why you want to get this done by a housekeeper, I just
> > > > did not understand why we need this whole move it around business is
> > > > required.
> > >
> > > This came about because of another objection against having it simply
> > > fixed to a processor. After all that processor may be disabled etc etc.
> >
> > I really regret that I did not pay more attention (though my cycle
> > constraints simply do not allow it).
>
> As far as I can see, the NO_HZ_FULL timekeeping CPU is always zero. If it
> can change in NO_HZ_FULL kernels, RCU will do some very strange things!

Good. I seriously hope it stays that way.

> One possible issue here is that Christoph's patch is unconditional.
> It takes effect for both NO_HZ_FULL and !NO_HZ_FULL. If I recall
> correctly, the timekeeping CPU -can- change in !NO_HZ_FULL kernels,
> which might be what Christoph was trying to take into account.

Ok. Sorry, I was just in a lousy mood after wasting half a day in
reviewing even lousier patches related to that NO_HZ* muck.

So, right, with NO_HZ_IDLE the timekeeper can move around and
housekeeping stuff might want to move around as well.

But it's not necessarily a good idea to bundle that with the
timekeeper, as under certain conditions the timekeeper duty can move
around fast and be left unassigned again when the system is fully idle.

And we really do not want a gazillion of sites which implement a
metric ton of different ways to connect some random housekeeping jobs
with the timekeeper.

So the proper solution to this is to have either a thread or a
dedicated housekeeping worker, which is placed by the scheduler
depending on the system configuration and workload.

That way it can be kept at cpu0 for the nohz=off and the nohz_full case.
In the nohz_idle case we can have different placement algorithms. On a
big/little ARM machine you probably want to keep it on the first cpu
of one or the other cluster. And there might be other constraints on
servers.

So we are way better off with a generic facility, where the various
housekeeping jobs can be queued.

Does that make sense?

Thanks,

	tglx
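
For illustration only, the generic facility described above could be sketched in userspace C roughly as follows. This is not kernel code and not an existing kernel API: every name here (hk_mode, hk_pick_cpu, hk_queue, hk_second_cluster_first_cpu and so on) is hypothetical, and the "two equally sized clusters" policy is an assumption made only to have something concrete for the big/little case. It shows one queue that all housekeeping jobs go through, and one placement function that returns cpu0 for nohz=off and nohz_full and defers to a pluggable policy for nohz_idle.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; this is not an existing kernel API. */

enum hk_mode { HK_NOHZ_OFF, HK_NOHZ_IDLE, HK_NOHZ_FULL };

/* Placement policy, consulted only in the nohz_idle case. */
typedef int (*hk_place_fn)(int ncpus);

/*
 * Example policy: first CPU of the second cluster, assuming two
 * equally sized clusters (a simplification of a big/little system).
 */
int hk_second_cluster_first_cpu(int ncpus)
{
	return ncpus / 2;
}

/* Pick the CPU on which the housekeeping worker should run. */
int hk_pick_cpu(enum hk_mode mode, int ncpus, hk_place_fn idle_policy)
{
	int cpu;

	assert(ncpus > 0);

	if (mode == HK_NOHZ_IDLE && idle_policy) {
		cpu = idle_policy(ncpus);
		if (cpu >= 0 && cpu < ncpus)
			return cpu;
	}

	/* nohz=off, nohz_full, or no usable policy: stay on cpu0. */
	return 0;
}

#define HK_MAX_JOBS 16

typedef void (*hk_job_fn)(void *arg);

/* One shared queue instead of many ad-hoc hooks into the timekeeper. */
struct hk_queue {
	hk_job_fn fn[HK_MAX_JOBS];
	void *arg[HK_MAX_JOBS];
	size_t count;
};

/* Queue a job; returns 0 on success, -1 if fn is NULL or the queue is full. */
int hk_queue_job(struct hk_queue *q, hk_job_fn fn, void *arg)
{
	if (!fn || q->count >= HK_MAX_JOBS)
		return -1;
	q->fn[q->count] = fn;
	q->arg[q->count] = arg;
	q->count++;
	return 0;
}

/* Run all queued jobs in FIFO order; returns how many were run. */
size_t hk_run_pending(struct hk_queue *q)
{
	size_t i, n = q->count;

	for (i = 0; i < n; i++)
		q->fn[i](q->arg[i]);
	q->count = 0;
	return n;
}

/* Demo job: increments the int that arg points to. */
void hk_demo_count_job(void *arg)
{
	(*(int *)arg)++;
}
```

In this sketch a caller would queue work with hk_queue_job() and never look at the timekeeper at all; only hk_pick_cpu() knows about the nohz mode, so a different placement rule for servers would be another hk_place_fn rather than another call site.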