在 2024/7/20 03:43, Michal Hocko 写道:
[...]
+ set_active_memcg(old_memcg);
It looks correct. But it's going through all dance to set up
current->active_memcg, then have the charge path look that up,
css_get(), call try_charge() only to bail immediately, css_put(), then
update current->active_memcg again. All those branches are necessary
when we want to charge to a "real" other cgroup. But in this case, we
always know we're not charging, so it seems uncalled for.
Wouldn't it be a lot simpler (and cheaper) to have a
filemap_add_folio_nocharge()?
Yes, that would certainly simplify things. From the previous discussion
I understood that there would be broader scopes which would opt-out from
charging. If this is really about a single filemap_add_folio call then
having a variant without doesn't call mem_cgroup_charge sounds like a
much more viable option and also it doesn't require to make any memcg
specific changes.
Talking about skipping mem cgroup charging, I still have a question.
[MEMCG AT FOLIO EVICTION TIME]
Even we completely skip the mem cgroup charging, we cannot really escape
the eviction time handling.
In fact if we just skip the mem_cgroup_charge(), kernel would crash when
evicting the folio.
As in lru_gen_eviction(), folio_memcg() would just return NULL, and
mem_cgroup_id(memcg) would trigger a NULL pointer dereference.
That's why I sent out a patch fixing that first:
https://lore.kernel.org/linux-mm/e1036b9cc8928be9a7dec150ab3a0317bd7180cf.1720572937.git.wqu@xxxxxxxx/
I'm not sure if it's going to cause any extra problems even with the
above fix.
And just for the sake of consistency, it looks more sane to have
root_mem_cgroup for the filemap_add_folio() operation, other than leave
it empty, especially since most filemaps still need proper memcg handling.
[REALLY EXPENSIVE?]
Another question is, is the set_active_memcg() and later handling really
that expensive?
set_active_memcg() is small enough to be an inline function, so is the
active_memcg(), css_get() and the root memcg path of try_charge().
Later commit part is not that expensive either, mostly simple member or
per-cpu assignment.
According to my very little knowledge about mem cgroup, most of the
heavy lifting part is in the slow path of try_charge_memcg().
Even with all the set_active_memcg(), the whole extra overhead still
look very tiny.
And it should already be a big win for btrfs to opt-out the regular
charging routine.
Thanks,
Qu