Re: Can the huge zero page be partially mapped?

Yang Shi <shy828301@xxxxxxxxx> · Mon, 4 Mar 2024 11:19:36 -0800

On Mon, Mar 4, 2024 at 8:54 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> I looked at the definition of is_huge_zero_page():
>
> static inline bool is_huge_zero_page(struct page *page)
> {
>         return READ_ONCE(huge_zero_page) == page;
> }
>
> That made me raise my eyebrows a bit because it will return false for
> tail pages of the HZP (that was at least unexpected for me).  Then we
> have this beauty:
>
> void free_page_and_swap_cache(struct page *page)
> {
>         struct folio *folio = page_folio(page);
>
>         free_swap_cache(folio);
>         if (!is_huge_zero_page(page))
>                 folio_put(folio);
> }
>
> So if we can call free_page_and_swap_cache() with a tail of the HZP
> we can absolutely screw up its refcounting.  Now, we have VM_BUGs
> to catch the refcount going below 0, and I haven't seen them being
> hit, so I _presume_ it doesn't happen, but maybe somebody inventive
> could come up with a way of putting a HZP tail into a page table ...?

The huge zero pmd split is handled specially by
__split_huge_zero_page_pmd(), which replaces every subpage of the
HZP with the regular zero page, so splitting the pmd never puts an
HZP tail page into a page table.
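
To make the point concrete, here is a rough userspace model (not kernel
code) of what the split does to the PTE slots. Everything in it is a
stand-in I made up for illustration: struct page is an empty shell,
hzp[] and zero_page are plain globals, HPAGE_PMD_NR is hard-coded to
512 (4K pages under a 2M pmd), and split_huge_zero_pmd_model() and
points_into_hzp() are hypothetical helpers. Only is_huge_zero_page()
mirrors the helper quoted above, minus READ_ONCE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HPAGE_PMD_NR 512	/* assumption: 4K subpages under a 2M pmd */

struct page { int unused; };	/* stand-in, not the kernel's struct page */

static struct page hzp[HPAGE_PMD_NR];	/* model: HZP head + tail pages */
static struct page zero_page;		/* model: the ordinary zero page */
static struct page *huge_zero_page = &hzp[0];

/* Same head-only comparison as the quoted helper (without READ_ONCE). */
static bool is_huge_zero_page(const struct page *page)
{
	return huge_zero_page == page;
}

/* Hypothetical helper: is this page the head or any tail of the HZP? */
static bool points_into_hzp(const struct page *page)
{
	for (size_t i = 0; i < HPAGE_PMD_NR; i++)
		if (page == &hzp[i])
			return true;
	return false;
}

/*
 * Model of the split: every PTE slot is pointed at the small zero
 * page, never at &hzp[i], so no HZP subpage lands in the page table.
 */
static void split_huge_zero_pmd_model(struct page *pte[HPAGE_PMD_NR])
{
	for (size_t i = 0; i < HPAGE_PMD_NR; i++)
		pte[i] = &zero_page;
}
```

In this model is_huge_zero_page(&hzp[1]) is false, which is the
surprise Matthew describes, but after split_huge_zero_pmd_model() no
slot holds a pointer for which points_into_hzp() is true, so the
head-only check is never asked about a tail on this path.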

>