On Mon, Oct 18, 2021 at 4:31 PM Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>
> On Fri, Oct 15, 2021 at 01:22:41AM +0100, Joao Martins wrote:
>
> > dev_pagemap_mapping_shift() does a lookup to figure out
> > which order the page table entry represents. is_zone_device_page()
> > is already used to gate usage of dev_pagemap_mapping_shift(). I think
> > this might be an artifact of the same issue as 3) in which PMDs/PUDs
> > are represented with base pages and hence you can't do what the rest
> > of the world does with:
>
> This code looks broken as written.
>
> vma_address() relies on certain properties that maybe DAX (maybe
> even only FSDAX?) sets on its ZONE_DEVICE pages, and
> dev_pagemap_mapping_shift() does not handle the -EFAULT return. It
> will crash if a memory failure hits any other kind of ZONE_DEVICE
> area.

That case is gated with a TODO in memory_failure_dev_pagemap(). I
never got any response to queries about what to do about memory
failure vs HMM.

>
> I'm not sure the comment is correct anyhow:
>
>                 /*
>                  * Unmap the largest mapping to avoid breaking up
>                  * device-dax mappings which are constant size. The
>                  * actual size of the mapping being torn down is
>                  * communicated in siginfo, see kill_proc()
>                  */
>                 unmap_mapping_range(page->mapping, start, size, 0);
>
> Because for non PageAnon unmap_mapping_range() does either
> zap_huge_pud(), __split_huge_pmd(), or zap_huge_pmd().
>
> Despite its name __split_huge_pmd() does not actually split, it will
> call __split_huge_pmd_locked:
>
>         } else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)))
>                 goto out;
>         __split_huge_pmd_locked(vma, pmd, range.start, freeze);
>
> Which does
>         if (!vma_is_anonymous(vma)) {
>                 old_pmd = pmdp_huge_clear_flush_notify(vma, haddr, pmd);
>
> Which is a zap, not split.
>
> So I wonder if there is a reason to use anything other than 4k here
> for DAX?
>
> >       tk->size_shift = page_shift(compound_head(p));
> >
> > ... as page_shift() would just return PAGE_SHIFT (as compound_order() is 0).
>
> And what would be so wrong with memory failure doing this as a 4k
> page?

device-dax does not support misaligned mappings. It makes hard
guarantees for applications that cannot afford the page table
allocation overhead of sub-1GB mappings.
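
To make the alignment point concrete, here is a minimal user-space
sketch, not kernel code, of the start/size arithmetic that "unmap the
largest mapping" implies. The helper range_for_shift(), the struct, and
the shift constants are made up for illustration and assume x86-64
sizes (4 KiB base pages, 2 MiB PMDs, 1 GiB PUDs).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 values; other architectures differ. */
#define PAGE_SHIFT_4K 12u /* 4 KiB base page */
#define PMD_SHIFT_2M  21u /* 2 MiB PMD mapping */
#define PUD_SHIFT_1G  30u /* 1 GiB PUD mapping */

/* Hypothetical container for the byte range handed to an unmap call. */
struct unmap_range {
	uint64_t start; /* byte offset into the file/device, aligned to size */
	uint64_t size;  /* bytes to unmap */
};

/*
 * Hypothetical helper: given the page offset of the poisoned page
 * (in 4 KiB units) and the shift of the mapping that covers it,
 * return the naturally aligned range spanning that whole mapping.
 */
static struct unmap_range range_for_shift(uint64_t pgoff,
					  unsigned int size_shift)
{
	struct unmap_range r;

	assert(size_shift >= PAGE_SHIFT_4K && size_shift < 64);

	r.size = UINT64_C(1) << size_shift;
	/* Round the byte offset down to the mapping's natural alignment. */
	r.start = (pgoff << PAGE_SHIFT_4K) & ~(r.size - 1);
	return r;
}
```

With a size_shift of PAGE_SHIFT_4K the range is a single 4 KiB page in
the middle of the larger mapping; with PUD_SHIFT_1G it covers the whole
aligned 1 GiB region, which is the case the comment in the quoted code
is trying to preserve for device-dax.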