On Mon, May 8, 2023 at 3:46 PM David Hildenbrand <david@xxxxxxxxxx> wrote: > > On 08.05.23 23:55, Pasha Tatashin wrote: > > On Mon, May 8, 2023 at 2:52 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > >> > >> On Mon, May 08, 2023 at 02:48:59PM -0700, Pasha Tatashin wrote: > >>> On Mon, May 8, 2023 at 2:36 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > >>>> > >>>> On Mon, May 08, 2023 at 05:27:10PM -0400, Pasha Tatashin wrote: > >>>>>> static void page_table_check_set(struct mm_struct *mm, unsigned long addr, > >>>>>> unsigned long pfn, unsigned long pgcnt, > >>>>>> bool rw) > >>>>>> { > >>>>>> // ... > >>>>>> anon = PageAnon(page); > >>>>>> for (i = 0; i < pgcnt; i++) { > >>>>>> // ... > >>>>>> if (anon) { > >>>>>> BUG_ON(atomic_read(&ptc->file_map_count)); > >>>>>> BUG_ON(atomic_inc_return(&ptc->anon_map_count) > 1 && rw); > >>>>>> } else { > >>>>>> BUG_ON(atomic_read(&ptc->anon_map_count)); > >>>>>> BUG_ON(atomic_inc_return(&ptc->file_map_count) < 0); > >>>>>> } > >>>>>> // ... > >>>>>> } > >>>>>> // ... > >>>>>> } > >>>>>> > >>>>>> This call to PageAnon is invalid for slab pages because slab reuses the bits > >>>>>> in struct page/folio to store its internal states, and the anonymity bit only > >>>>>> exists in struct page/folio. As a result, the counters are incorrectly updated > >>>>>> and checked in page_table_check_set and page_table_check_clear, leading to the > >>>>>> bug being raised. > >>>>> > >>>>> We should change anon boolean to be: > >>>>> > >>>>> anon = !PageSlab(page) && PageAnon(page); > >>>> > >>>> No. Slab pages are not elegible for mapping into userspace. That's > >>> > >>> Sure, I can add BUG_ON(PageSlab(page)); to page_table_check_set. > >>> > >>>> all. There should be a BUG() for that. And I do mean BUG(), not > >>>> "return error to user". Something has gone horribly wrong, and it's > >>>> time to crash. > >>> > >>> It is just too easy to make slab available via remap_pfn_range(), but > >>> I do not think we want to add BUG() into the remap function, otherwise > >>> we will break devices such as /dev/mem. > >> > >> Slab pages can't be mmaped. Really, no matter what interface you're > >> using. page->_mapcount is necessarily incremented by mapping to > >> userspace, and slab uses that space for its own purposes (and has > >> for decades). It's similar for page tables and other allocations that > >> use PageType. > > > > Mapping random memory in /dev/mem can cause mapping slab pages in to > > userspace, the page->_mapcount is not incremented (and other fields > > are not accessed) in that case, as we are using VM_PFNMAP type VMA, > > which does not access "struct page". > > We should be using vm_normal_page() to identify if we should be looking > at the struct page or not, no? For normal Kernel-MM operations, vm_normal_page() should be used to get "struct page" based on vma+addr+pte combination, but page_table_check does not use vma for its operation in order to strengthen the verification of no invalid page sharing. But, even vm_normal_page() can cause access to the "struct page" for VM_PFNMAP if pfn_valid(pfn) is true. So, vm_normal_page() can return a struct page for a user mapped slab page. Pasha > > -- > Thanks, > > David / dhildenb >