Michael Kelley <mikelley@xxxxxxxxxxxxx> writes: > From: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> Sent: Friday, May 10, 2019 6:22 AM >> >> >> >> I think we can consider these allocations being DMA-like (because >> >> Hypervisor accesses this memory too) so you can probably take a look at >> >> dma_pool_create()/dma_pool_alloc() and friends. >> >> >> > >> > I've taken a look at dma_pool_create(), and it takes a "struct device" >> > argument with which the DMA pool will be associated. That probably >> > makes DMA pools a bad choice for this usage. Pages need to be allocated >> > pretty early during boot for Hyper-V communication, and even if the >> > device subsystem is initialized early enough to create a fake device, >> > such a dependency seems rather dubious. >> >> We can probably use dma_pool_create()/dma_pool_alloc() from vmbus module >> but these 'early' allocations may not have a device to bind to indeed. >> >> > >> > kmem_cache_create/alloc() seems like the only choice to get >> > guaranteed alignment. Do you see any actual problem with >> > using kmem_cache_*, other than the naming? It seems like these >> > kmem_cache_* functions really just act as a sub-allocator for >> > known-size allocations, and "cache" is a common usage >> > pattern, but not necessarily the only usage pattern. >> >> Yes, it's basically the name - it makes it harder to read the code and >> some future refactoring of kmem_cache_* may not take our use-case into >> account (as we're misusing the API). We can try renaming it to something >> generic of course and see what -mm people have to say :-) >> > > This makes me think of creating Hyper-V specific alloc/free functions > that wrap whatever the backend allocator actually is. So we have > hv_alloc_hyperv_page() and hv_free_hyperv_page(). That makes the > code very readable and the intent is super clear. > > As for the backend allocator, an alternative is to write our own simple > allocator. It maintains a single free list. If hv_alloc_hyperv_page() is > called, and the free list is empty, do alloc_page() and break it up into > Hyper-V sized pages to replenish the free list. (On x86, these end up > being 1-for-1 operations.) hv_free_hyperv_page() just puts the Hyper-V > page back on the free list. Don't worry trying to combine and do > free_page() since there's very little free'ing done anyway. And I'm > assuming GPF_KERNEL is all we need. > > If in the future Linux provides an alternate general-purpose allocator > that guarantees alignment, we can ditch the simple allocator and use > the new mechanism with some simple code changes in one place. > > Thoughts? > +1 for adding wrappers and if the allocator turns out to be more-or-less trivial I think we can live with that for the time being. -- Vitaly