On Tue, Feb 22, 2022 at 09:44:22AM +0100, Christoph Hellwig wrote:
> On Mon, Feb 21, 2022 at 02:57:34PM +0100, Heiko Carstens wrote:
> > > 1) Kmalloc(GFP_DMA) in s390 platform, under arch/s390 and drivers/s390;
> >
> > So, s390 partially requires GFP_DMA allocations for memory areas which
> > are required by the hardware to be below 2GB. There is not necessarily
> > a device associated when this is required. E.g. some legacy "diagnose"
> > calls require buffers to be below 2GB.
> >
> > How should something like this be handled? I'd guess that the
> > dma_alloc API is not the right thing to use in such cases. Of course
> > we could say, let's waste memory and use full pages instead, however
> > I'm not sure this is a good idea.
>
> Yeah, I don't think the DMA API is the right thing for that. This
> is one of the very rare cases where a raw allocation makes sense.
>
> That being said being able to drop kmalloc support for GFP_DMA would
> be really useful. How much memory would we waste if switching to the
> page allocator?

At first glance this would not waste much memory, since most callers
seem to allocate such memory pieces only temporarily.

> > The question is: what would this buy us? As stated above I'd assume
> > this comes with quite some code churn, so there should be a good
> > reason to do this.
>
> There is two steps here. One is to remove GFP_DMA support from
> kmalloc, which would help to cleanup the slab allocator(s) very nicely,
> as at that point it can stop to be zone aware entirely.

Well, looking at slub.c it looks like there is only a very minimal
maintenance burden for GFP_DMA/GFP_DMA32 support.

> The long term goal is to remove ZONE_DMA entirely at least for
> architectures that only use the small 16MB ISA-style one. It can
> then be replaced with for example a CMA area and fall into a movable
> zone. I'd have to prototype this first and see how it applies to the
> s390 case. It might not be worth it and maybe we should replace
> ZONE_DMA and ZONE_DMA32 with a ZONE_LIMITED for those use cases as
> the amount covered tends to not be totally out of line for what we
> built the zone infrastructure.

So probably I'm missing something; but for small systems where we
would only have ZONE_DMA, how would a CMA area within this zone
improve things? If I'm not mistaken then the page allocator will not
fall back to any CMA area for GFP_KERNEL allocations. That is: we would
somehow need to find "the right size" for the CMA area, depending on
memory size. This looks like a new problem class which currently does
not exist.

Besides that we would also not have all the debugging options provided
by the slab allocator anymore.

Anyway, maybe it would make more sense if you would send your patch
and then we can see where we would end up.
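
For a rough feel of the "how much would we waste" question, here is a
minimal userspace sketch, not kernel code. It assumes 4 KiB pages and
that each request is simply rounded up to whole pages. The helper names
and SKETCH_PAGE_SIZE are made up for illustration. The page allocator
hands out power-of-two page counts, so for multi-page requests this is
only a lower bound on the real overhead.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. Hypothetical constant, not a kernel symbol. */
#define SKETCH_PAGE_SIZE 4096u

/* Bytes consumed when a request is rounded up to whole pages. */
static size_t page_backed_size(size_t request)
{
	assert(request > 0);
	return ((request + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE) *
	       SKETCH_PAGE_SIZE;
}

/* Bytes left unused at the end of the last page. */
static size_t page_backed_waste(size_t request)
{
	return page_backed_size(request) - request;
}
```

With these assumptions a 256 byte buffer would tie up a full page and
leave 3840 bytes unused for as long as it is allocated, which is why
the lifetime of the allocations matters more than their number.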