Matthew Wilcox wrote on Wed, Sep 04, 2019:
> >  - the vma was created with a vm_flags including VM_MIXEDMAP for some
> > reason, I don't know why.
> > If I change it to VM_PFNMAP (which sounds better here from the little I
> > understand of this as we do not need cow and looks a bit simpler?), I
> > can remove the vm_insert_page() path and use the vmf_insert_pfn one
> > instead, which appears to work fine for simple programs... But the
> > kernel thread for my network adapter (bxi... which is not upstream
> > either I guess.. sigh..) no longer tries to fault via my custom .fault
> > vm operation... Which means I probably did need MIXEDMAP ?
>
> Strange ... PFNMAP absolutely should try to fault via the ->fault
> vm operation (although see below)

It does fault in some context, just not in another..
A bit weird but I'll stick to MIXEDMAP for now - I'm really curious as
to what the difference is, "normal" applications seem to work fine with
either mode, it's only the bxi driver that

> > I tried adding a huge_fault vm op thinking it might be called with a
> > more appropriate pmd but it doesn't seem to be called at all in my
> > case..? I would have assumed from the code that it would try every page
>
> You shouldn't be calling vmf_insert_pfn_pmd() from a regular ->fault
> handler, as by then the fault handler has already inserted a PMD.
> The ->huge_fault handler is the place to call it from.
>
> You may need to force PMD-alignment for your call to mmap().

I was missing setting the VM_HUGE_FAULT vm_flags2 bit in the vma - the
huge_fault handler is now called, and I no longer have the pre-existing
pmd problem; that's a much better solution than manually fiddling with
flags :)

Question though - is it ok to insert small pages if the huge_fault
handler is called with PE_SIZE_PMD ?
(I think the pte insertion will automatically create the pmd, but it
would be good to confirm)

Now that I've got this I'm back to where I stood with my kludge though:
programs work until they exit, and the zap_huge_pmd() function tries to
withdraw the pagetable from some magic field that was never set in my
case...

I realize this is old code no longer upstream, but my new workaround for
this (looking at the zap_huge_pmd function) was to pretend my file is
dax.
Now that I've set it as dax I think it actually makes sense as in
"there's memory here that points to something Linux no longer manages
directly, just let it be" and we might benefit from the other exceptions
dax has. I'll need to look at what this implies in more detail...

> Hope these pointers are slightly more useful than a rubber duck ;-)

Much appreciated, thank you for taking the time! :)

Off to debug my network driver for the PFNMAP behaviour next, and then
some more testing... I'm sure I broke something seemingly unrelated on
the other side of the project!
-- 
Dominique
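
On the PMD-alignment hint for mmap(): below is a minimal userspace sketch
of one common way to get a PMD-aligned address, by over-allocating and
trimming. It assumes a 2 MiB PMD (x86_64 with 4 KiB pages); the names
PMD_SIZE_ASSUMED and mmap_pmd_aligned() are made up for illustration, and
it maps anonymous memory so that it can run anywhere. It is not taken from
the driver discussed here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: 2 MiB PMD size (x86_64, 4 KiB base pages). */
#define PMD_SIZE_ASSUMED (1UL << 21)

/*
 * Hypothetical helper: returns an anonymous read/write mapping of at
 * least len bytes whose start address is PMD-aligned, or NULL on failure.
 * The length is rounded up to a multiple of PMD_SIZE_ASSUMED.
 */
static void *mmap_pmd_aligned(size_t len)
{
	const size_t pmd = PMD_SIZE_ASSUMED;
	size_t span;
	uint8_t *raw, *aligned, *end, *raw_end;

	/* The mask arithmetic below requires a power of two. */
	assert((pmd & (pmd - 1)) == 0);

	if (len == 0 || len > SIZE_MAX - 2 * pmd)
		return NULL;

	/* Round the length up to whole PMDs, then over-allocate by one PMD. */
	len = (len + pmd - 1) & ~(pmd - 1);
	span = len + pmd;

	raw = mmap(NULL, span, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;

	/* First PMD-aligned address inside the over-sized region. */
	aligned = (uint8_t *)(((uintptr_t)raw + pmd - 1) & ~(uintptr_t)(pmd - 1));
	end = aligned + len;
	raw_end = raw + span;

	/* Give back the unaligned head and the leftover tail. */
	if (aligned > raw)
		munmap(raw, (size_t)(aligned - raw));
	if (raw_end > end)
		munmap(end, (size_t)(raw_end - end));

	return aligned;
}
```

For a device file, the same idea would presumably be to reserve the
aligned range first and then mmap() the file descriptor over it with
MAP_FIXED; whether that is enough for the huge_fault path to be taken
depends on the driver and kernel version.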