On 06/24/21 at 11:47am, Robin Murphy wrote:
> On 2021-06-24 10:29, Baoquan He wrote:
> > On 06/24/21 at 08:40am, Christoph Hellwig wrote:
> > > So reduce the amount allocated. But the pool is needed for proper
> > > operation on systems with memory encryption. And please add the right
> > > maintainer or at least mailing list for the code you're touching next
> > > time.
> >
> > Oh, I thought it's memory issue only, should have run
> > ./scripts/get_maintainer.pl. sorry.
> >
> > About reducing the amount allocated, it may not help. Because on x86_64,
> > kdump kernel doesn't put any page of memory into buddy allocator of DMA
> > zone. Means it will definitely OOM for atomic_pool_dma initialization.
> >
> > Wondering in which case or on which device the atomic pool is needed on
> > AMD system with mem encryption enabled. As we can see, the OOM will
> > happen too in kdump kernel on Intel system, even though it's not
> > necessary.

Sorry for the very late response, and thank you both for your comments.

>
> Hmm, I think the Kconfig reshuffle has actually left a slight wrinkle here.
> For DMA_DIRECT_REMAP=y we can assume an atomic pool is always needed, since
> that was the original behaviour anyway. However the implications of
> AMD_MEM_ENCRYPT=y are different - even if support is enabled, it still
> should only be relevant if mem_encrypt_active(), so it probably does make
> sense to have an additional runtime gate on that.
>
> From a quick scan, use of dma_alloc_from_pool() already depends on
> force_dma_unencrypted() so that's probably fine already, but I think we'd
> need a bit of extra protection around dma_free_from_pool() to prevent
> gen_pool_has_addr() dereferencing NULL if the pools are uninitialised, even
> with your proposed patch as it is. Presumably nothing actually called
> dma_direct_free() when you tested this?

Yes, enforcing the conditional check of force_dma_unencrypted() around
dma_free_from_pool() sounds reasonable, just as we have done for
dma_alloc_from_pool().

I have tested this patchset on normal x86_64 systems and on one AMD
system with SME support. Disabling the atomic pool fixes the issue where
the DMA zone has no managed pages, so requesting a page from the DMA zone
causes an allocation failure. Even disabling the atomic pool in the first
kernel didn't cause any problem on one AMD EPYC system which supports
SME. I am not an expert in the DMA area, so I am wondering what the
atomic pool is supposed to do on SME/SEV systems.

Besides, even with the atomic pool disabled, the slub page allocation
for dma-kmalloc also triggers a page allocation failure. So I have
switched to another way to fix both of them, please check the v2 post.
Disabling the atomic pool can still be a good-to-have change.

Thanks
Baoquan
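
To make the extra protection discussed above concrete, here is a minimal
user-space sketch of the idea. It is not the kernel code: struct toy_pool,
toy_pool_has_addr(), toy_free_from_pool() and the memory_encryption_active
flag are hypothetical stand-ins for struct gen_pool, gen_pool_has_addr(),
dma_free_from_pool() and the force_dma_unencrypted() check, and the real
functions take different arguments. It only shows the ordering of the
checks, so that an uninitialised pool is never dereferenced.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct gen_pool. */
struct toy_pool {
	uintptr_t start;	/* first address covered by the pool */
	size_t size;		/* number of bytes covered by the pool */
};

/* NULL models an atomic pool that was never initialised (assumption). */
struct toy_pool *atomic_pool_dma;

/* Stand-in for the runtime force_dma_unencrypted() gate (assumption). */
bool memory_encryption_active;

/* Like gen_pool_has_addr() in spirit: this dereferences the pool. */
bool toy_pool_has_addr(const struct toy_pool *pool, uintptr_t addr,
		       size_t size)
{
	return addr >= pool->start && size <= pool->size &&
	       addr - pool->start <= pool->size - size;
}

/*
 * Returns true only if the range belongs to the pool. Both guards run
 * before toy_pool_has_addr(), so a NULL pool is never dereferenced.
 */
bool toy_free_from_pool(uintptr_t addr, size_t size)
{
	if (!memory_encryption_active)	/* runtime gate, mirrors alloc side */
		return false;
	if (!atomic_pool_dma)		/* pool disabled or init failed */
		return false;
	if (!toy_pool_has_addr(atomic_pool_dma, addr, size))
		return false;
	/* Real code would hand the block back to the pool here. */
	return true;
}
```

With both guards in place, a free path that reaches this function while
the pool is uninitialised simply falls through to the normal free path
instead of crashing.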