Re: [PATCH] mm: compaction beware writeback

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Mar 19, 2011 at 11:27:38PM -0700, Hugh Dickins wrote:
> I notice there's a Bug 31142 "Large write to USB stick freezes"
> discussion happening (which I've not digested), for which Andrea
> is proposing a patch which reminds me of this one.  Thought I'd
> better throw this into the mix for consideration.

With regard to that bug:

https://bugzilla.kernel.org/attachment.cgi?id=51262

there's no sign of __wait_on_bit called by migrate_pages. I think the
problem there are the ->writepage run on udf to writeout dirty pages
(to retry migration in the next iteration of the loop when the page
isn't dirty anymore), which hopefully it's solved by the patch I sent.

> D  loop0:
> schedule +0x670
> io_schedule +0x50
> sync_page +0x84
> __wait_on_bit +0x90
> wait_on_page_bit +0xa4
> unmap_and_move +0x180
> migrate_pages +0xbc
> compact_zone +0xbc
> compact_zone_order +0xc8
> try_to_compact_pages +0x104
> __alloc_pages_direct_compact +0xc0
> __alloc_pages_nodemask +0x68c
> allocate_slab +0x84
> new_slab +0x58
> __slab_alloc +0x1ec
> kmem_cache_alloc +0x7c
> radix_tree_preload +0x94
> add_to_page_cache_locked +0x78
> shmem_getpage +0x208
> pagecache_write_begin +0x2c
> do_lo_send_aops +0xc0
> do_bio_filebacked +0x11c
> loop_thread +0x204
> kthread +0xac
> kernel_thread +0x54

So your patch will avoid waiting above in writeback if called by
direct compaction like above, agreed (currently we only avoid to call
lock_page but we could still wait in wait_on_page_writeback if force=1
goes on at the third pass of the migrate_pages loop). This seems a
separate issue to the last trace posted in bug 31142 as there was no
wait_on_bit in that last trace.

Interesting that slab allocates with order > 0 an object that is <4096
bytes. Is this related to slab_break_gfp_order? The comment talks
about fragmentation resistance in low memory, if that's the same
fragmentation that anti-frag solves, is this logic still actual? I
guess we still need it for cache coloring. However I noticed there's
no fallback to that, so once kmem_cache_create decided that an optimal
gfporder is > 0 (even if it doesn't need to be > 0) if the order 1
allocation fails, kmem_cache_alloc will fail too without fallback to
order 0. It could be more reliable than that.

A large allocation if nothing else reduces the frequency of gfp calls.

> So I've extended your earlier PF_MEMALLOC patch to prevent waiting for
> writeback as well as waiting for pagelock.  And I've never seen the
> hang again since putting this patch in.
> 
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> ---

This change looks good to me, it will make compaction a little less
reliable with regard to writeback. You are still going to execute
->writepage on dirty pages though (except for __GFP_NO_KSWAPD
allocations that signals the fact they don't need to be reliable with
my last patch for bug 31142 and foces sync=0 and have that sync
parameter checked before calling ->writepage too).

compaction usually has an huge amount of "source" memory to move, in
the destination space, so it's not so good that migration is so
aggressive and blocks when there may be another couple of contiguous
source page to move without waiting. Ideally we should move to the
next block of source pages in compaction, before setting force=1 in a
final pass. We insist a lot in migrate_pages when breaking the loop
after 1 pass could be enough, and then it should be compaction that if
it fails migration on all candidate source pages, tries an "harder
migrate" in a second pass. We could achieve that by having compaction
twiddle with the sync bit instead of the caller and only run one pass
of migrate when sync = 0. That's a bit larger change though and it may
consume more cpu in compaction (compaction loop not so cheap), so I'm
unsure if it would be an overall improvement.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]