Re: [PATCH v4 2/5] zswap: make shrinking memcg-aware

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



cc-ing Johannes, Roman, Shakeel, Muchun since you all know much more
about memory controller + list_lru reparenting logic than me.

There seems to be a race between memcg offlining and zswap’s
cgroup-aware LRU implementation:

CPU0                            CPU1
zswap_lru_add()                 mem_cgroup_css_offline()
    get_mem_cgroup_from_objcg()
                                    memcg_offline_kmem()
                                        memcg_reparent_objcgs()
                                        memcg_reparent_list_lrus()
                                            memcg_reparent_list_lru()
                                                memcg_reparent_list_lru_node()
    list_lru_add()
                                                memcg_list_lru_free()


Essentially: on CPU0, zswap gets the memcg from the entry's objcg
(before the objcgs are reparented). Then it performs list_lru_add()
after the list_lru entries reparenting (memcg_reparent_list_lru_node())
step. If the list_lru of the memcg being offlined has not been freed
(i.e before the memcg_list_lru_free() call), then the list_lru_add()
call would succeed - but the list will be freed soon after. The new
zswap entry as a result will not be subjected to future reclaim
attempt. IOW, this list_lru_add() call is effectively swallowed. And
worse, there might be a crash when we invalidate the zswap_entry in the
future (which will perform a list_lru removal).

Within get_mem_cgroup_from_objcg(), none of the following seem
sufficient to prevent this race:

    1. Perform the objcg-to-memcg lookup inside a rcu_read_lock()
    section.
    2. Checking if the memcg is freed yet (with css_tryget()) (what
    we're currently doing in this patch series).
    3. Checking if the memcg is still online (with css_tryget_online())
    The memcg can still be offlined down the line.


I've discussed this privately with Johannes, and it seems like the
cleanest solution here is to move the reparenting logic down to release
stage. That way, when get_mem_cgroup_from_objcg() returns,
zswap_lru_add() is given an memcg that is reparenting-safe (until we
drop the obtained reference).




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux