Re: kernel bug(VM_BUG_ON_PAGE) with 3.18.13 in mm/migrate.c

Hi Mel,

On Thu, May 28, 2015 at 5:00 AM, Mel Gorman <mgorman@xxxxxxx> wrote:
> On Wed, May 27, 2015 at 11:05:33AM -0700, Jovi Zhangwei wrote:
>> Hi,
>>
>> I hit the kernel BUG below on our 3.18.13 stable kernel:
>> "kernel BUG at mm/migrate.c:1661!"
>>
>> Source code:
>>
>> 1657    static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>> 1658    {
>> 1659            int page_lru;
>> 1660
>> 1661            VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page), page);
>>
>> The error is easy to trigger by running tcpdump on our system (I am not
>> sure whether it reproduces as easily on other systems):
>> "sudo tcpdump -i bond0.100 'tcp port 4242' -c 100000000000 -w 4242.pcap"
>>
>> Any comments on this bug would be greatly appreciated. Thanks.
>>
>
> What sort of compound page is it? What sort of VMA is it in? hugetlbfs
> pages should never be tagged for NUMA migrate and never enter this
> path. Transparent huge pages are handled properly so I'm wondering
> exactly what type of compound page this is and what mapped it into
> userspace.
>
Thanks for your reply.

After reading alloc_one_pg_vec_page() in net/packet/af_packet.c, I found
that compound pages are indeed mapped into userspace there.
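
To illustrate why such ring buffer blocks end up as compound pages, here
is a minimal userspace sketch. It assumes 4 KiB pages, and block_order()
is a hypothetical helper that only mirrors the idea of rounding a block
size up to a power-of-two number of pages. It is not the kernel code.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper: smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size.
 */
static unsigned int block_order(unsigned long size)
{
	unsigned long pages;
	unsigned int order = 0;

	assert(size > 0);

	/* Round the byte size up to whole pages. */
	pages = (size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;

	/* Find the smallest power of two that covers that many pages. */
	while ((1UL << order) < pages)
		order++;

	return order;
}
```

Any block larger than one page gets a nonzero order here. A nonzero-order
allocation made with the compound flag is what compound_order() later
reports on the head page.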

I sent a patch for this issue (you may have received it), but I am not
sure it is the right fix. Feel free to update it or use your own patch.

Thanks.

--------------------------------------------------------------------------------------------

[PATCH] mm/migrate: Avoid migrate mmaped compound pages

The kernel VM bug below can be triggered by tcpdump, which mmaps a lot of
pages allocated with the __GFP_COMP flag.

[Mon May 25 05:29:33 2015] page:ffffea0015414000 count:66 mapcount:1 mapping:          (null) index:0x0
[Mon May 25 05:29:33 2015] flags: 0x20047580004000(head)
[Mon May 25 05:29:33 2015] page dumped because: VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page))
[Mon May 25 05:29:33 2015] ------------[ cut here ]------------
[Mon May 25 05:29:33 2015] kernel BUG at mm/migrate.c:1661!
[Mon May 25 05:29:33 2015] invalid opcode: 0000 [#1] SMP

The fix is simply to refuse to migrate mmapped compound pages: return 0
instead of triggering the VM bug.

Signed-off-by: Jovi Zhangwei <jovi.zhangwei@xxxxxxxxx>
---
 mm/migrate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index f53838f..839adef 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1606,7 +1606,8 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
 {
 	int page_lru;
 
-	VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page), page);
+	if (compound_order(page) && !PageTransHuge(page))
+		return 0;
 
 	/* Avoid migrating to a node that is nearly full */
 	if (!migrate_balanced_pgdat(pgdat, 1UL << compound_order(page)))
-- 
1.9.1
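
For readers who want to see the changed decision in isolation, the sketch
below models only the early exit that the patch introduces. The type
struct sketch_page and the function sketch_may_isolate() are hypothetical
stand-ins, not kernel interfaces. The two fields model what
compound_order() and PageTransHuge() report for a page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two page properties the check reads. */
struct sketch_page {
	unsigned int order;	/* models compound_order(page) */
	bool trans_huge;	/* models PageTransHuge(page) */
};

/*
 * Returns 0 when the page is a compound page that is not a transparent
 * huge page (the case that used to hit VM_BUG_ON_PAGE), 1 otherwise.
 */
static int sketch_may_isolate(const struct sketch_page *page)
{
	assert(page != NULL);

	if (page->order && !page->trans_huge)
		return 0;	/* skip: not migrated, no BUG */

	return 1;		/* order-0 page or THP: continue as before */
}
```

With this model, an order-0 page and a transparent huge page still
proceed, while a compound page of the kind af_packet maps into userspace
is skipped.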
