On Wed 15-11-17 09:00:20, Johannes Weiner wrote:
> On Wed, Nov 15, 2017 at 10:02:51AM +0100, Michal Hocko wrote:
> > On Tue 14-11-17 06:37:42, Tetsuo Handa wrote:
> > > This patch uses polling loop with short sleep for unregister_shrinker()
> > > rather than wait_on_atomic_t(), for we can save reader's cost (plain
> > > atomic_dec() compared to atomic_dec_and_test()), we can expect that
> > > do_shrink_slab() of unregistering shrinker likely returns shortly, and
> > > we can avoid khungtaskd warnings when do_shrink_slab() of unregistering
> > > shrinker unexpectedly took so long.
> >
> > I would use wait_event_interruptible in the remove path rather than the
> > short sleep loop which is just too ugly. The shrinker walk would then
> > just wake_up the sleeper when the ref. count drops to 0. Two
> > synchronize_rcu is quite ugly as well, but I was not able to simplify
> > them. I will keep thinking. It just sucks how we cannot follow the
> > standard rcu list with dynamically allocated structure pattern here.
>
> It's because the refcount is dropped too early. The refcount protects
> the object during shrink, but not for the list_next(), and so you need
> an additional grace period just for that part.

Exactly.

> I think you could drop the reference count in the next iteration. This
> way the list_next() works without requiring a second RCU grace period.

That would work. I was playing with the idea of prefetching the next
element before dropping the reference, but that would require a lock for
the remove operation. Ugly...

> ref count protects the object and its list pointers; RCU protects what
> the list pointers point to before we acquire the reference:
>
> 	rcu_read_lock();
> 	list_for_each_entry_rcu(pos, list) {
> 		if (!atomic_inc_not_zero(&pos->ref))
> 			continue;
> 		rcu_read_unlock();
>
> 		if (prev)
> 			atomic_dec(&prev->ref);
> 		prev = pos;
>
> 		shrink();
>
> 		rcu_read_lock();
> 	}
> 	rcu_read_unlock();
> 	if (prev)
> 		atomic_dec(&prev->ref);
>
> In any case, Minchan's lock breaking seems way preferable over that
> level of headscratching complexity for an unusual case like Shakeel's.

Agreed! I would go the more complex way only if it turns out that the
early break out causes some real problems.
-- 
Michal Hocko
SUSE Labs
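
For anyone who wants to poke at the deferred-drop walk quoted above outside
the kernel, here is a minimal userspace sketch. It is only an illustration
under loud assumptions: struct node, ref_get_not_zero(), ref_put() and
walk() are made-up names, the list is a plain singly linked list, C11
atomics stand in for the kernel's atomic_t helpers, and the RCU read-side
calls appear only as comments because nothing runs concurrently here. It
shows only the ordering of the reference get/put, not the unregister side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered shrinker; not the kernel struct. */
struct node {
	atomic_int ref;		/* 0 means "going away, do not use" */
	int scanned;		/* how many times the callback ran */
	struct node *next;	/* plain pointer standing in for the RCU list */
};

/* Userspace emulation of atomic_inc_not_zero(): take a ref unless it is 0. */
static bool ref_get_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, 'old' is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; the count must have been positive. */
static void ref_put(atomic_int *ref)
{
	int old = atomic_fetch_sub(ref, 1);

	assert(old > 0);
}

/*
 * Walk the list and "shrink" every live node. The reference on the
 * previous node is dropped only after the next node has been pinned,
 * so the pointer we advance through always belongs to a pinned node.
 * Returns the number of nodes whose callback ran.
 */
static int walk(struct node *head)
{
	struct node *prev = NULL;
	int visited = 0;

	/* rcu_read_lock() would go here */
	for (struct node *pos = head; pos; pos = pos->next) {
		if (!ref_get_not_zero(&pos->ref))
			continue;	/* node is being removed, skip it */
		/* rcu_read_unlock() would go here */

		if (prev)
			ref_put(&prev->ref);
		prev = pos;

		pos->scanned++;		/* stand-in for shrink() */
		visited++;

		/* rcu_read_lock() would go here */
	}
	/* rcu_read_unlock() would go here */
	if (prev)
		ref_put(&prev->ref);

	return visited;
}
```

Calling walk() on a three-node list whose middle node has ref == 0 should
skip that node and leave every count where it started, since each taken
reference is dropped exactly once, one iteration late.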