On Tue, 2023-05-23 at 14:33 -0700, Dave Hansen wrote: > On 5/23/23 14:25, Sean Christopherson wrote: > > > There are consequences for converting pages between shared and > > > private. > > > Doing it on a vmalloc() mapping is guaranteed to fracture the > > > underlying > > > EPT/SEPT mappings. > > > > > > How does this work with load_unaligned_zeropad()? Couldn't it be > > > running around poking at one of these vmalloc()'d pages via the > > > direct > > > map during a shared->private conversion before the page has been > > > accepted? > > Would it be feasible and sensible to add a GFP_SHARED or whatever, > > to communicate > > to the core allocators that the page is destined to be converted to > > a shared page? > > I assume that would provide a common place (or two) for initiating > > conversions, > > and would hopefully allow for future optimizations, e.g. to keep > > shared allocation > > in the same pool or whatever. Sharing memory without any > > intelligence as to what > > memory is converted is going to make both the guest and host sad. > > I don't think we want a GFP flag. This is still way too specialized > to > warrant one of those. > > It sounds like a similar problem to what folks want for modules or > BPF. > There are a bunch of allocations that are related and can have some > of > their setup/teardown costs amortized if they can be clumped together. > > For BPF, the costs are from doing RW=>RO in the kernel direct map, > and > fracturing it in the process. > > Here, the costs are from the private->shared conversions and > fracturing > both the direct map and the EPT/SEPT. > > I just don't know if there's anything that we can reuse from the BPF > effort. Last I saw the BPF allocator was focused on module space memory and packing multiple allocations into pages. So that would probably have to be generalized to support this. But also, a lot of the efforts around improving direct map modification efficiency have sort of assumed all of the types of special permissions could be grouped together on the direct map and just mapped elsewhere in whatever permission they needed. This does seem like a special case where it really needs to be grouped together only with similar permissions. If a GFP flag is too precious, what about something like PKS tables[0] tried. It kind of had a similar problem with respect to preferring physically contiguous special permissioned memory on the direct map. It did this with a cache of converted pages outside the page allocator and a separate alloc() function to get at it. This one could be vmalloc_shared() or something. Don't know, just tossing it out there. [0] https://lore.kernel.org/lkml/20210830235927.6443-7-rick.p.edgecombe@xxxxxxxxx/