Re: [PATCH] mm/page_table_check: Fix crash on ZONE_DEVICE

Pasha Tatashin wrote:
> On Wed, Jun 5, 2024 at 5:21 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > Not all pages may apply to pgtable check.  One example is ZONE_DEVICE
> > pages: they map PFNs directly, and they don't allocate page_ext at all even
> > if there's struct page around.  One may reference devm_memremap_pages().
> >
> > When both ZONE_DEVICE and page-table-check enabled, then try to map some
> > dax memories, one can trigger kernel bug constantly now when the kernel was
> > trying to inject some pfn maps on the dax device:
> >
> >  kernel BUG at mm/page_table_check.c:55!
> >
> > While it's pretty legal to use set_pxx_at() for ZONE_DEVICE pages for page
> > fault resolutions, skip all the checks if page_ext doesn't even exist in
> > pgtable checker, which applies to ZONE_DEVICE but maybe more.
> 
> Thank you for reporting this bug. A few comments below:
> 
> >
> > Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Cc: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
> > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> > ---
> >  mm/page_table_check.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/page_table_check.c b/mm/page_table_check.c
> > index 4169576bed72..509c6ef8de40 100644
> > --- a/mm/page_table_check.c
> > +++ b/mm/page_table_check.c
> > @@ -73,6 +73,9 @@ static void page_table_check_clear(unsigned long pfn, unsigned long pgcnt)
> >         page = pfn_to_page(pfn);
> >         page_ext = page_ext_get(page);
> >
> > +       if (!page_ext)
> > +               return;
> 
> I would replace the above with the following, here and in other places:
> 
> if (!page_ext) {
>   WARN_ONCE(!is_zone_device_page(page),
>                           "page_ext is missing for a non-device page\n");
>   return;
> }

Hmm, but this function is silent for the !pfn_valid(@pfn) case, and the
old code has BUG_ON(!page_ext). So we know the caller is not being
careful about @pfn, and existing code is likely avoiding the BUG_ON().

The justification for the WARN_ONCE(), or maybe VM_WARN_ONCE(), would
be a high likelihood that ongoing kernel changes introduce more pages
that are pfn_valid() but not covered by page_ext. Is that a realistic
scenario?
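
To make the trade-off concrete, here is a rough userspace model of the
two options, not kernel code. Every mock_* name is made up for
illustration, the page kinds are an assumption about which cases
matter, and the real page_table_check_clear() does more than this.
With warn_on_missing false it behaves like the plain "return" in the
patch; with it true it behaves like the suggested WARN_ONCE() variant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model only: none of the mock_* names exist in the kernel. */
enum mock_kind {
	MOCK_NORMAL,		/* has page_ext */
	MOCK_ZONE_DEVICE,	/* struct page exists, no page_ext */
	MOCK_NORMAL_NO_EXT,	/* hypothetical future case: valid pfn, no page_ext */
};

struct mock_page {
	enum mock_kind kind;
	int ext;		/* stand-in for the page_ext payload */
};

#define MOCK_NR_PFNS 3
static struct mock_page mock_pages[MOCK_NR_PFNS] = {
	{ MOCK_NORMAL, 0 },
	{ MOCK_ZONE_DEVICE, 0 },
	{ MOCK_NORMAL_NO_EXT, 0 },
};

static int mock_warnings;	/* number of warnings actually emitted */

static bool mock_pfn_valid(unsigned long pfn)
{
	return pfn < MOCK_NR_PFNS;
}

/* Returns NULL for any page that has no page_ext in this model. */
static int *mock_page_ext_get(struct mock_page *page)
{
	assert(page != NULL);
	return page->kind == MOCK_NORMAL ? &page->ext : NULL;
}

/* Returns true only if the check actually ran for this pfn. */
static bool mock_check_clear(unsigned long pfn, bool warn_on_missing)
{
	static bool warned;	/* models the "once" in WARN_ONCE() */
	struct mock_page *page;

	if (!mock_pfn_valid(pfn))
		return false;	/* silent today */

	page = &mock_pages[pfn];
	if (!mock_page_ext_get(page)) {
		if (warn_on_missing && page->kind != MOCK_ZONE_DEVICE && !warned) {
			warned = true;
			mock_warnings++;
			fprintf(stderr, "page_ext is missing for a non-device page\n");
		}
		return false;	/* skip the check instead of BUG_ON() */
	}

	page->ext = 0;		/* stand-in for the real accounting */
	return true;
}
```

In this model the two variants differ only for the third page kind, so
the warning only pays for itself if such pages are expected to appear.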



