On Tue, May 19, 2020 at 05:41:15PM -0700, David Rientjes wrote:
> Hi Greg and everyone,
>
> On all kernels, SEV enabled guests hit might_sleep() warnings when a
> driver (nvme in this case) allocates through the DMA API in a
> non-blockable context:
>
> BUG: sleeping function called from invalid context at mm/vmalloc.c:1710
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3383, name: fio
> 2 locks held by fio/3383:
>  #0: ffff93b6a8568348 (&sb->s_type->i_mutex_key#16){+.+.}, at: ext4_file_write_iter+0xa2/0x5d0
>  #1: ffffffffa52a61a0 (rcu_read_lock){....}, at: hctx_lock+0x1a/0xe0
> CPU: 0 PID: 3383 Comm: fio Tainted: G        W         5.5.10 #14
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  dump_stack+0x98/0xd5
>  ___might_sleep+0x175/0x260
>  __might_sleep+0x4a/0x80
>  _vm_unmap_aliases+0x45/0x250
>  vm_unmap_aliases+0x19/0x20
>  __set_memory_enc_dec+0xa4/0x130
>  set_memory_decrypted+0x10/0x20
>  dma_direct_alloc_pages+0x148/0x150
>  dma_direct_alloc+0xe/0x10
>  dma_alloc_attrs+0x86/0xc0
>  dma_pool_alloc+0x16f/0x2b0
>  nvme_queue_rq+0x878/0xc30 [nvme]
>  __blk_mq_try_issue_directly+0x135/0x200
>  blk_mq_request_issue_directly+0x4f/0x80
>  blk_mq_try_issue_list_directly+0x46/0xb0
>  blk_mq_sched_insert_requests+0x19b/0x2b0
>  blk_mq_flush_plug_list+0x22f/0x3b0
>  blk_flush_plug_list+0xd1/0x100
>  blk_finish_plug+0x2c/0x40
>  iomap_dio_rw+0x427/0x490
>  ext4_file_write_iter+0x181/0x5d0
>  aio_write+0x109/0x1b0
>  io_submit_one+0x7d0/0xfa0
>  __x64_sys_io_submit+0xa2/0x280
>  do_syscall_64+0x5f/0x250
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> There is a series of patches in Christoph's dma-mapping.git repo in the
> for-next branch on track for 5.8:
>
> 1d659236fb43 dma-pool: scale the default DMA coherent pool size with memory capacity
> 82fef0ad811f x86/mm: unencrypted non-blocking DMA allocations use coherent pools
> 2edc5bb3c5cc dma-pool: add pool sizes to debugfs
> 76a19940bd62 dma-direct: atomic allocations must come from atomic coherent pools
> 54adadf9b085 dma-pool: dynamically expanding atomic pools
> c84dc6e68a1d dma-pool: add additional coherent pools to map to gfp mask
> e860c299ac0d dma-remap: separate DMA atomic pools from direct remap code
>
> We'd like to prepare backports to LTS kernels so that our guest images are
> not modified by us and don't exhibit this issue.
>
> They are bigger than we'd like:
>
>  arch/x86/Kconfig            |   1 +
>  drivers/iommu/dma-iommu.c   |   5 +-
>  include/linux/dma-direct.h  |   2 +
>  include/linux/dma-mapping.h |   6 +-
>  kernel/dma/Kconfig          |   6 +-
>  kernel/dma/Makefile         |   1 +
>  kernel/dma/direct.c         |  56 ++++++--
>  kernel/dma/pool.c           | 264 ++++++++++++++++++++++++++++++++++++
>  kernel/dma/remap.c          | 121 +---------------
>  9 files changed, 324 insertions(+), 138 deletions(-)
>  create mode 100644 kernel/dma/pool.c
>
> But they apply relatively cleanly to more modern kernels like 5.4. We'd
> like to backport these all the way to 4.19, however; otherwise guests
> encounter these bugs.
>
> The changes to kernel/dma/remap.c, for example, simply move code to the
> new pool.c.
> But that original code is actually in arch/arm64 in 4.19 and was moved
> in 5.0:
>
> commit 0c3b3171ceccb8830c2bb5adff1b4e9b204c1450
> Author: Christoph Hellwig <hch@xxxxxx>
> Date:   Sun Nov 4 20:29:28 2018 +0100
>
>     dma-mapping: move the arm64 noncoherent alloc/free support to common code
>
> commit f0edfea8ef93ed6cc5f747c46c85c8e53e0798a0
> Author: Christoph Hellwig <hch@xxxxxx>
> Date:   Fri Aug 24 10:31:08 2018 +0200
>
>     dma-mapping: move the remap helpers to a separate file
>
> And there are most certainly more dependencies needed to get a cleanly
> applying series for 4.19.123, so the backports could be quite extensive.
>
> Peter Gonda <pgonda@xxxxxxxxxx> is currently handling these, and we're
> looking for advice: should we compile a full list of the backports
> required to get a series with only minor conflicts, or is this going to
> be a non-starter?

A full series would be good.  Once these hit Linus's tree and show up in
a -rc or two, feel free to send on the backports and we can look at them
then.

thanks,

greg k-h
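
For context on what the quoted series implements, a minimal sketch
follows. It is hypothetical: the sketch_-prefixed names are invented for
illustration, this is not the code in kernel/dma/pool.c, and helpers such
as force_dma_unencrypted() have signatures that differ across kernel
versions. It only shows the idea behind the fix visible in the trace
above: set_memory_decrypted() may sleep, so a pool of pages is decrypted
once at boot from a blocking context, and non-blocking DMA allocations
are then served from that pre-decrypted pool.

	#include <linux/dma-direct.h>
	#include <linux/genalloc.h>
	#include <linux/gfp.h>
	#include <linux/init.h>
	#include <linux/numa.h>
	#include <linux/sched/mm.h>
	#include <linux/set_memory.h>

	/* 2 MiB here; the real series scales pool size with memory capacity. */
	#define SKETCH_POOL_ORDER	9

	static struct gen_pool *sketch_atomic_pool;

	/* Runs at boot, where sleeping is allowed, so the one-time
	 * decryption of the pool pages is safe in this context. */
	static int __init sketch_pool_init(void)
	{
		unsigned long addr;

		sketch_atomic_pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE);
		if (!sketch_atomic_pool)
			return -ENOMEM;

		addr = __get_free_pages(GFP_KERNEL, SKETCH_POOL_ORDER);
		if (!addr)
			return -ENOMEM;

		/* May sleep -- fine at boot, fatal under rcu_read_lock(). */
		set_memory_decrypted(addr, 1 << SKETCH_POOL_ORDER);

		return gen_pool_add(sketch_atomic_pool, addr,
				    PAGE_SIZE << SKETCH_POOL_ORDER, NUMA_NO_NODE);
	}
	postcore_initcall(sketch_pool_init);

	/* Hypothetical stand-in for the allocation-path decision in
	 * dma_direct_alloc(): callers that cannot block (no
	 * __GFP_DIRECT_RECLAIM, e.g. nvme_queue_rq() under rcu_read_lock())
	 * must never reach the sleeping set_memory_decrypted() path. */
	void *sketch_dma_alloc(struct device *dev, size_t size, gfp_t gfp)
	{
		if (force_dma_unencrypted(dev) && !gfpflags_allow_blocking(gfp))
			return (void *)gen_pool_alloc(sketch_atomic_pool, size);

		/* Blocking callers keep the old path: allocate fresh pages
		 * and decrypt them on demand (elided in this sketch). */
		return NULL;
	}

A matching free path would hand memory back with gen_pool_free(); per
the commit subjects quoted above, the real series also grows the pools
dynamically when they run low and reports their sizes in debugfs.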