On Tue, 2010-06-15 at 15:41 -0700, Andrew Morton wrote:
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> (switching back to email, actually)
>
> On Sun, 13 Jun 2010 13:01:57 GMT
> bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
>
> > https://bugzilla.kernel.org/show_bug.cgi?id=16148
> >
> >
> > Mikko C. <mikko.cal@xxxxxxxxx> changed:
> >
> >            What    |Removed                     |Added
> > ----------------------------------------------------------------------------
> >                  CC|                            |mikko.cal@xxxxxxxxx
> >
> >
> > --- Comment #8 from Mikko C. <mikko.cal@xxxxxxxxx> 2010-06-13 13:01:53 ---
> > I have been getting this with 2.6.35-rc2 and rc3.
> > Could it be the same problem?
> >
> >
> > X: page allocation failure. order:0, mode:0x4
> > Pid: 1514, comm: X Not tainted 2.6.35-rc3 #1
> > Call Trace:
> >  [<ffffffff8108ce49>] ? __alloc_pages_nodemask+0x629/0x680
> >  [<ffffffff8108c920>] ? __alloc_pages_nodemask+0x100/0x680
> >  [<ffffffffa00db8f3>] ? ttm_get_pages+0x2c3/0x448 [ttm]
> >  [<ffffffffa00d4658>] ? __ttm_tt_get_page+0x98/0xc0 [ttm]
> >  [<ffffffffa00d4988>] ? ttm_tt_populate+0x48/0x90 [ttm]
> >  [<ffffffffa00d4a26>] ? ttm_tt_bind+0x56/0xa0 [ttm]
> >  [<ffffffffa00d5230>] ? ttm_bo_handle_move_mem+0x1d0/0x430 [ttm]
> >  [<ffffffffa00d76d6>] ? ttm_bo_move_buffer+0x166/0x180 [ttm]
> >  [<ffffffffa00b9736>] ? drm_mm_kmalloc+0x26/0xc0 [drm]
> >  [<ffffffff81030ea9>] ? get_parent_ip+0x9/0x20
> >  [<ffffffffa00d7786>] ? ttm_bo_validate+0x96/0x130 [ttm]
> >  [<ffffffffa00d7b35>] ? ttm_bo_init+0x315/0x390 [ttm]
> >  [<ffffffffa0122eb8>] ? radeon_bo_create+0x118/0x210 [radeon]
> >  [<ffffffffa0122fb0>] ? radeon_ttm_bo_destroy+0x0/0xb0 [radeon]
> >  [<ffffffffa013704c>] ? radeon_gem_object_create+0x8c/0x110 [radeon]
> >  [<ffffffffa013711f>] ? radeon_gem_create_ioctl+0x4f/0xe0 [radeon]
> >  [<ffffffffa00b10e6>] ? drm_ioctl+0x3d6/0x470 [drm]
> >  [<ffffffffa01370d0>] ? radeon_gem_create_ioctl+0x0/0xe0 [radeon]
> >  [<ffffffff810b965f>] ? do_sync_read+0xbf/0x100
> >  [<ffffffff810c8965>] ? vfs_ioctl+0x35/0xd0
> >  [<ffffffff810c8b28>] ? do_vfs_ioctl+0x88/0x530
> >  [<ffffffff81031ed7>] ? sub_preempt_count+0x87/0xb0
> >  [<ffffffff810c9019>] ? sys_ioctl+0x49/0x80
> >  [<ffffffff810ba4fe>] ? sys_read+0x4e/0x90
> >  [<ffffffff810024ab>] ? system_call_fastpath+0x16/0x1b
>
> That's different. ttm_get_pages() looks pretty busted to me. It's not
> using __GFP_WAIT and it's not using __GFP_FS. It's using a plain
> GFP_DMA32 so it's using atomic allocations even though it doesn't need
> to. IOW, it's shooting itself in the head.
>
> Given that it will sometimes use GFP_HIGHUSER which includes __GFP_FS
> and __GFP_WAIT, I assume it can always include __GFP_FS and __GFP_WAIT.
> If so, it should very much do so. If not then the function is
> misdesigned and should be altered to take a gfp_t argument so the
> caller can tell ttm_get_pages() which is the strongest allocation mode
> which it may use.
>
> > [TTM] Unable to allocate page.
> > radeon 0000:01:05.0: object_init failed for (7827456, 0x00000002)
> > [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (7827456, 2, 4096, -12)
>
> This bug actually broke stuff for you.
>
> Something like this:
>
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c~a
> +++ a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> @@ -677,7 +677,7 @@ int ttm_get_pages(struct list_head *page
>  	/* No pool for cached pages */
>  	if (pool == NULL) {
>  		if (flags & TTM_PAGE_FLAG_DMA32)
> -			gfp_flags |= GFP_DMA32;
> +			gfp_flags |= GFP_KERNEL|GFP_DMA32;
>  		else
>  			gfp_flags |= GFP_HIGHUSER;
>
> _
>
> although I wonder whether it should be using pool->gfp_flags.
>
>
> It's a shame that this code was developed and merged in secret :( Had
> we known, we could have looked at enhancing mempools to cover the
> requirement, or at implementing this in some generic fashion rather
> than hiding it down in drivers/gpu/drm.

It's been posted to lkml at least once or twice over the past few years,
though not as often as it was posted to dri-devel, but that was because
we had never seen anyone show any interest in it outside of kernel
hackers.

Originally I was going to use the generic allocator stuff ia64 uses for
uncached allocations, but it allocates memory ranges, not pages, so it
wasn't useful. I also suggested getting a page flag for uncached
allocator stuff; I was told to go write the code in my own corner and
prove it was required. So I did, and it was cleaned up and worked on by
others and I merged it.

So can we lay off with the "in secret"? The original code is nearly 2
years old at this point, and just because -mm hackers choose to ignore
it isn't our fault. Patches welcome.

So now back to the bug:

The page pools are set up with gfp flags. In the normal case there are
4 pools: one WC with GFP_HIGHUSER pages, one UC with GFP_HIGHUSER pages,
one WC with GFP_USER|GFP_DMA32, and one UC with GFP_USER|GFP_DMA32. So
the pools are all fine; the problem here is the same as before we added
the pools, which is the normal page allocation path, which needs
GFP_USER added instead of GFP_KERNEL.

That said, I've noticed a lot more page allocation failure reports in
2.6.35-rcX than we've gotten for a long time, in code that hasn't
changed (the AGP ones the other day, for example). Has something in the
core MM regressed (again... didn't this happen back in 2.6.31 or
something)?

(cc'ing Mel who tracked these down before.)

Dave.
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel
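
As a rough illustration of the flag combinations argued over in this
thread, the userspace C sketch below decodes the reported mode:0x4 and
compares it with the masks proposed as fixes. It is not kernel code. The
SK_* names are hypothetical, and the bit values are an assumption,
recalled from include/linux/gfp.h in kernels of the 2.6.35 era rather
than taken from this thread; only the mode:0x4 value and the GFP_* names
come from the messages above.

```c
#include <assert.h>
#include <stdio.h>

/* ASSUMPTION: bit values modelled on 2.6.35-era include/linux/gfp.h.
 * Check them against the tree you actually build. */
#define SK_GFP_HIGHMEM   0x02u
#define SK_GFP_DMA32     0x04u
#define SK_GFP_WAIT      0x10u     /* allocator may sleep and reclaim */
#define SK_GFP_IO        0x40u     /* allocator may start physical I/O */
#define SK_GFP_FS        0x80u     /* allocator may call into the fs */
#define SK_GFP_HARDWALL  0x20000u  /* enforce cpuset memory policy */

/* Composite masks, built the same way the kernel builds them. */
#define SK_GFP_KERNEL    (SK_GFP_WAIT | SK_GFP_IO | SK_GFP_FS)
#define SK_GFP_USER      (SK_GFP_KERNEL | SK_GFP_HARDWALL)
#define SK_GFP_HIGHUSER  (SK_GFP_USER | SK_GFP_HIGHMEM)

/* Returns 1 if the mask lets the allocator sleep, 0 otherwise. */
static int sk_may_sleep(unsigned int mask)
{
    return (mask & SK_GFP_WAIT) != 0;
}

/* Returns 1 if the mask lets the allocator call into the filesystem. */
static int sk_may_use_fs(unsigned int mask)
{
    return (mask & SK_GFP_FS) != 0;
}

/* Prints one line summarising what a mask permits. */
static void sk_describe(const char *label, unsigned int mask)
{
    assert(label != NULL);
    printf("%-24s 0x%05x  may sleep: %-3s  may use fs: %s\n",
           label, mask,
           sk_may_sleep(mask) ? "yes" : "no",
           sk_may_use_fs(mask) ? "yes" : "no");
}

/* Compares the failing mask with the candidate replacements. */
static void sk_report(void)
{
    sk_describe("reported mode:0x4", 0x4u);
    sk_describe("GFP_KERNEL|GFP_DMA32", SK_GFP_KERNEL | SK_GFP_DMA32);
    sk_describe("GFP_USER|GFP_DMA32", SK_GFP_USER | SK_GFP_DMA32);
    sk_describe("GFP_HIGHUSER", SK_GFP_HIGHUSER);
}
```

Calling sk_report() from a main() prints one line per mask. Under the
assumed values, the reported mask carries only the DMA32 zone bit, which
is why the allocation cannot sleep or reclaim, while both proposed
replacements set the wait and fs bits and differ only in the hardwall
bit.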