Re: [PATCH v6 42/45] mm: shrinker: make global slab shrink lockless

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 2023/12/6 15:47, Lai Jiangshan wrote:
On Tue, Sep 12, 2023 at 9:57 PM Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx> wrote:

-       if (!down_read_trylock(&shrinker_rwsem))
-               goto out;
-
-       list_for_each_entry(shrinker, &shrinker_list, list) {
+       /*
+        * lockless algorithm of global shrink.
+        *
+        * In the unregistration setp, the shrinker will be freed asynchronously
+        * via RCU after its refcount reaches 0. So both rcu_read_lock() and
+        * shrinker_try_get() can be used to ensure the existence of the shrinker.
+        *
+        * So in the global shrink:
+        *  step 1: use rcu_read_lock() to guarantee existence of the shrinker
+        *          and the validity of the shrinker_list walk.
+        *  step 2: use shrinker_try_get() to try get the refcount, if successful,
+        *          then the existence of the shrinker can also be guaranteed,
+        *          so we can release the RCU lock to do do_shrink_slab() that
+        *          may sleep.
+        *  step 3: *MUST* to reacquire the RCU lock before calling shrinker_put(),
+        *          which ensures that neither this shrinker nor the next shrinker
+        *          will be freed in the next traversal operation.

Hello, Qi, Andrew, Paul,

I wonder know how RCU can ensure the lifespan of the next shrinker.
it seems it is diverged from the common pattern usage of RCU+reference.

cpu1:
rcu_read_lock();
shrinker_try_get(this_shrinker);
rcu_read_unlock();
     cpu2: shrinker_free(this_shrinker);
     cpu2: shrinker_free(next_shrinker); and free the memory of next_shrinker
     cpu2: when shrinker_free(next_shrinker), no one updates this_shrinker's next
     cpu2: since this_shrinker has been removed first.

No, this_shrinker will not be removed from the shrinker_list until the
last refcount is released. See below:

rcu_read_lock();
shrinker_put(this_shrinker);

	CPU 1                                      CPU 2

  --> if (refcount_dec_and_test(&shrinker->refcount))
		complete(&shrinker->done);

				wait_for_completion(&shrinker->done);
                                list_del_rcu(&shrinker->list);

travel to the freed next_shrinker.

a quick simple fix:

// called with other references other than RCU (i.e. refcount)
static inline rcu_list_deleted(struct list_head *entry)
{
    // something like this:
    return entry->prev == LIST_POISON2;
}

// in the loop
if (rcu_list_deleted(&shrinker->list)) {
    shrinker_put(shrinker);
    goto restart;
}
rcu_read_lock();
shrinker_put(shrinker);

Thanks
Lai

+        *  step 4: do shrinker_put() paired with step 2 to put the refcount,
+        *          if the refcount reaches 0, then wake up the waiter in
+        *          shrinker_free() by calling complete().
+        */
+       rcu_read_lock();
+       list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
                 struct shrink_control sc = {
                         .gfp_mask = gfp_mask,
                         .nid = nid,
                         .memcg = memcg,
                 };

+               if (!shrinker_try_get(shrinker))
+                       continue;
+
+               rcu_read_unlock();
+
                 ret = do_shrink_slab(&sc, shrinker, priority);
                 if (ret == SHRINK_EMPTY)
                         ret = 0;
                 freed += ret;
-               /*
-                * Bail out if someone want to register a new shrinker to
-                * prevent the registration from being stalled for long periods
-                * by parallel ongoing shrinking.
-                */
-               if (rwsem_is_contended(&shrinker_rwsem)) {
-                       freed = freed ? : 1;
-                       break;
-               }
+
+               rcu_read_lock();
+               shrinker_put(shrinker);
         }





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux