Re: kernel BUG at mm/huge_memory.c:2613!

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2020-06-22 13:46, Joerg Roedel wrote:
+ Robin

Robin, any idea on this?

After a bit of archaeology, this dates back to the original review:

https://lore.kernel.org/linux-arm-kernel/54C285D4.3070802@xxxxxxx/
https://lore.kernel.org/linux-arm-kernel/54DA2666.9030003@xxxxxxx/

In summary: originally this inherited from other arch code that did simply strip __GFP_COMP; that was deemed questionable because of the nonsensical comment about CONFIG_HUGETLBFS that was stuck to it; the current code is like it is because in 5 and a half years nobody said that it's wrong :)

If there actually *are* good reasons for stripping __GFP_COMP, then I've certainly no objection to doing so.

Robin.

On Thu, Jun 18, 2020 at 10:40:26PM -0400, Andrea Arcangeli wrote:
Hello,

On Thu, Jun 18, 2020 at 06:14:49PM -0700, Roman Gushchin wrote:
I agree. The whole

	page = alloc_pages_node(nid, alloc_flags, order);
	if (!page)
		continue;
	if (!order)
		break;
	if (!PageCompound(page)) {
		split_page(page, order);
		break;
	} else if (!split_huge_page(page)) {
		break;
	}

looks very suspicious to me.
My wild guess is that gfp flags changed somewhere above, so we hit
the branch which was never hit before.

Right to be suspicious about the above: split_huge_page on a regular
page allocated by a driver was never meant to work.

The PageLocked BUG_ON is just a symptom of a bigger issue, basically
split_huge_page it may survive, but it'll stay compound and in turn it
must be freed as compound.

The respective free method doesn't even contemplate freeing compound
pages, the only way the free method can survive, is by removing
__GFP_COMP forcefully in the allocation that was perhaps set here
(there are that many __GFP_COMP in that directory):

static void snd_malloc_dev_pages(struct snd_dma_buffer *dmab, size_t size)
{
	gfp_t gfp_flags;

	gfp_flags = GFP_KERNEL
		| __GFP_COMP	/* compound page lets parts be mapped */

And I'm not sure what the comment means here, compound or non compound
doesn't make a difference when you map it, it's not a THP, the
mappings must be handled manually so nothing should check PG_compound
anyway in the mapping code.

Something like this may improve things, it's an untested quick hack,
but this assumes it's always a bug to setup a compound page for these
DMA allocations and given the API it's probably a correct
assumption.. Compound is slower, unless you need it, you can avoid it
and then split_page will give contiguous memory page granular. Ideally
the code shouldn't call split_page at all and it should free it all at
once by keeping track of the order and by returning the order to the
caller, something the API can't do right now as it returns a plain
array that can only represent individual small pages.

Once this is resolved, you may want to check your config, iommu passthrough
sounds more optimal for a soundcard.

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f68a62c3c32b..3dfbc010fa83 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -499,6 +499,10 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
/* IOMMU can map any pages, so himem can also be used here */
  	gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
+	if (unlikely(gfp & __GFP_COMP)) {
+		WARN();
+		gfp &= ~__GFP_COMP;
+	}
while (count) {
  		struct page *page = NULL;
@@ -522,13 +526,8 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
  				continue;
  			if (!order)
  				break;
-			if (!PageCompound(page)) {
-				split_page(page, order);
-				break;
-			} else if (!split_huge_page(page)) {
-				break;
-			}
-			__free_pages(page, order);
+			split_page(page, order);
+			break;
  		}
  		if (!page) {
  			__iommu_dma_free_pages(pages, i);
diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
index 6850d13aa98c..378f5a36ec5f 100644
--- a/sound/core/memalloc.c
+++ b/sound/core/memalloc.c
@@ -28,7 +28,6 @@ static void snd_malloc_dev_pages(struct snd_dma_buffer *dmab, size_t size)
  	gfp_t gfp_flags;
gfp_flags = GFP_KERNEL
-		| __GFP_COMP	/* compound page lets parts be mapped */
  		| __GFP_NORETRY /* don't trigger OOM-killer */
  		| __GFP_NOWARN; /* no stack trace print - this call is non-critical */
  	dmab->area = dma_alloc_coherent(dmab->dev.dev, size, &dmab->addr,




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux