Re: [PATCH 2/5] mm: vmscan: make memcg slab shrink lockless

On 2023/2/23 03:58, Kirill Tkhai wrote:
> On 22.02.2023 10:32, Qi Zheng wrote:
>>
>>
>> On 2023/2/22 05:28, Kirill Tkhai wrote:
>>> On 20.02.2023 12:16, Qi Zheng wrote:
>>> <...>
>>>>   void reparent_shrinker_deferred(struct mem_cgroup *memcg)
>>>>   {
>>>> -	int i, nid;
>>>> +	int i, nid, srcu_idx;
>>>>  	long nr;
>>>>  	struct mem_cgroup *parent;
>>>>  	struct shrinker_info *child_info, *parent_info;
>>>> @@ -429,16 +443,16 @@ void reparent_shrinker_deferred(struct mem_cgroup *memcg)
>>>>  		parent = root_mem_cgroup;
>>>>  
>>>>  	/* Prevent from concurrent shrinker_info expand */
>>>> -	down_read(&shrinker_rwsem);
>>>> +	srcu_idx = srcu_read_lock(&shrinker_srcu);

>>> Don't we still have to be protected against parallel expand_one_shrinker_info()?
>>>
>>> It looks like the parent->nodeinfo[nid]->shrinker_info pointer may be changed in expand*
>>> right after we've dereferenced it here.

>> Hi Kirill,
>>
>> Oh, indeed. We may wrongly reparent the child's nr_deferred to the old
>> parent's nr_deferred (which is about to be freed by call_srcu).
>>
>> reparent_shrinker_deferred() is only called on the memcg offline path
>> (not a hot path), so we may be able to use the shrinker_mutex introduced
>> later in this series for protection. What do you think?

> It looks good to me. One more thing I analyzed is whether we want to
> allow two reparent_shrinker_deferred() calls to execute in parallel.

I see that mem_cgroup_css_offline() is already protected by
cgroup_mutex, so maybe shrinker_mutex is enough here. :)


> Possibly we should leave the rwsem there, as it was used before..

>>>>  	for_each_node(nid) {
>>>> -		child_info = shrinker_info_protected(memcg, nid);
>>>> -		parent_info = shrinker_info_protected(parent, nid);
>>>> +		child_info = shrinker_info_srcu(memcg, nid);
>>>> +		parent_info = shrinker_info_srcu(parent, nid);
>>>>  		for (i = 0; i < shrinker_nr_max; i++) {
>>>>  			nr = atomic_long_read(&child_info->nr_deferred[i]);
>>>>  			atomic_long_add(nr, &parent_info->nr_deferred[i]);
>>>>  		}
>>>>  	}
>>>> -	up_read(&shrinker_rwsem);
>>>> +	srcu_read_unlock(&shrinker_srcu, srcu_idx);
>>>>   }
>>>>  
>>>>   static bool cgroup_reclaim(struct scan_control *sc)
>>>> @@ -891,15 +905,14 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
>>>>   {
>>>>  	struct shrinker_info *info;
>>>>  	unsigned long ret, freed = 0;
>>>> +	int srcu_idx;
>>>>  	int i;
>>>>  
>>>>  	if (!mem_cgroup_online(memcg))
>>>>  		return 0;
>>>>  
>>>> -	if (!down_read_trylock(&shrinker_rwsem))
>>>> -		return 0;
>>>> -
>>>> -	info = shrinker_info_protected(memcg, nid);
>>>> +	srcu_idx = srcu_read_lock(&shrinker_srcu);
>>>> +	info = shrinker_info_srcu(memcg, nid);
>>>>  	if (unlikely(!info))
>>>>  		goto unlock;
>>>>  
>>>> @@ -949,14 +962,9 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
>>>>  				set_shrinker_bit(memcg, nid, i);
>>>>  		}
>>>>  		freed += ret;
>>>> -
>>>> -		if (rwsem_is_contended(&shrinker_rwsem)) {
>>>> -			freed = freed ? : 1;
>>>> -			break;
>>>> -		}
>>>>  	}
>>>>  unlock:
>>>> -	up_read(&shrinker_rwsem);
>>>> +	srcu_read_unlock(&shrinker_srcu, srcu_idx);
>>>>  	return freed;
>>>>   }
>>>>   #else /* CONFIG_MEMCG */




--
Thanks,
Qi



