On Wed, Mar 15, 2023 at 5:09 PM Waiman Long <longman@xxxxxxxxxx> wrote: > > > On 3/15/23 17:43, Yosry Ahmed wrote: > > On Wed, Mar 15, 2023 at 5:19 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > >> On Wed, Mar 15, 2023 at 12:04:10AM -0700, Yosry Ahmed wrote: > >>> On Tue, Mar 14, 2023 at 9:54 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > >>>> On Mon, Mar 13, 2023 at 02:08:53PM -0700, Yosry Ahmed wrote: > >>>>> On Mon, Mar 13, 2023 at 12:44 PM Andrew Morton > >>>>> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > >>>>>> On Mon, 13 Mar 2023 08:34:52 +0000 Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote: > >>>>>> > >>>>>>> From: Hugh Dickins <hughd@xxxxxxxxxx> > >>>>>>> > >>>>>>> In a kernel with added WARN_ON_ONCE(PageTail) in page_memcg_check(), we > >>>>>>> observed a warning from page_cgroup_ino() when reading > >>>>>>> /proc/kpagecgroup. > >>>>>> If this is the only known situation in which page_memcg_check() is > >>>>>> passed a tail page, why does page_memcg_check() have > >>>>>> > >>>>>> if (PageTail(page)) > >>>>>> return NULL; > >>>>>> > >>>>>> ? Can we remove this to simplify, streamline and clarify? > >>>>> I guess it's a safety check so that we don't end up trying to cast a > >>>>> tail page to a folio. My opinion is to go one step further and change > >>>>> page_memcg_check() to do return the memcg of the head page, i.e: > >>>>> > >>>>> static inline struct mem_cgroup *page_memcg_check(struct page *page) > >>>>> { > >>>>> return folio_memcg_check(page_folio(page)); > >>>>> } > >>>> If you look at my commit becacb04fdd4, I was preserving the existing > >>>> behaviour of page_memcg_check() when passed a tail page. It would > >>>> previously, rightly or wrongly, read the memcg_data from the tail page > >>>> and get back NULL. > >>> Right, I looked at that. I also looked at 1b7e4464d43a which added > >>> folio_memcg() and changed page_memcg()'s behavior to use page_folio() > >>> to retrieve the memcg from the head, which made me wonder why > >>> different decisions were made for these 2 helpers. > >>> > >>> Were the users of page_memcg() already passing in head pages only? > >> There were 18 months between those commits ... I'd learned to be > >> more careful about maintaining the semantics instead of changing > >> them to "what they should have been". > >> > >>>> I suspect that was not the intended behaviour, but I do not think this > >>>> patch is the right fix; it simply papers over the problem and maybe > >>>> creates a new one. Callers of page_memcg_check() should be eliminated, > >>>> precisely because of this ambiguity. It's up to the people who understand > >>>> each of the callers who need to make the decision to always convert the > >>>> page that they have to a folio and ask about its memcg, or whether they > >>>> want to preserve the existing behaviour of returning NULL for tail pages. > >>>> > >>>> So, I say NACK to this patch as it does not preserve existing behaviour, > >>>> and does not advance our understanding of what we have wrought. > >>> I am not sure which patch you are NACKing, the original patch from > >>> Hugh (adding compound_head() to page_cgroup_ino()) or the suggested > >>> alternative patch which changes page_memcg_check() to use > >>> page_folio(). > >> Both patches are NACKed. page_memcg_check() needs to go away > >> because it has the tail page ambiguity. Both callers should be using > >> folio_memcg_check() directly and resolving for themselves what the > >> correct behaviour is when seeing a tail page. > >> > > I agree. I even suggested this to Michal in one of the replies. > > > > For page_cgroup_ino() we can simply pass in > > folio_memcg(page_folio(page)), which would mimic what Hugh's patch is > > doing for page_cgroup_ino(). > > > > For page owner, I am not sure if we want to do something similar > > (which would start printing the memcg for tail pages as well), or > > explicitly excluding tail pages and THEN do > > folio_memcg(page_folio(page)) to get the memcg of head pages. Waiman, > > what do you think? > > I prefer the current behavior of printing information for the head page > only. I am open to suggestion of the best APIs to use. I think instead of explicitly checking page->memcg_data, we can check PageTail() and return explicitly for tail pages tails, check PageSlab() to print the message for slab pages, then get the page's memcg through folio_memcg_check(page_folio(page)). Something like: static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret, struct page *page) { ... rcu_read_lock(); /* Only head pages hold refs to a memcg */ if (PageTail(page)) goto out_unlock; if (PageSlab(page)) ret += scnprintf(kbuf + ret, count - ret, "Slab cache page\n"); memcg = folio_memcg_check(page_folio(page)); if (!memcg) goto out_unlock; ... } Matthew, What do you think? > > Cheers, > Longman >