On Wed, Jun 16, 2010 at 01:08:00AM +1000, Nick Piggin wrote:
> On Tue, Jun 15, 2010 at 03:51:34PM +0100, Mel Gorman wrote:
> > On Tue, Jun 15, 2010 at 04:00:11PM +0200, Andrea Arcangeli wrote:
> > > When memory pressure is low, not going into ->writepage may be
> > > beneficial from a latency perspective too. (but again it depends how
> > > much it matters to go in LRU and how beneficial is the cache, to know
> > > if it's worth taking clean cache away even if hotter than dirty cache)
> > >
> > > About the stack overflow did you ever get any stack-debug error?
> >
> > Not an error. Got a report from Dave Chinner though and it's what kicked
> > off this whole routine in the first place. I've been recording stack
> > usage figures but not reporting them. In reclaim I'm getting to about 5K
> > deep but this was on simple storage and XFS was ignoring attempts for
> > reclaim to writeback.
> >
> > http://lkml.org/lkml/2010/4/13/121
> >
> > Here is one of my own stack traces though
> >
> >         Depth    Size   Location    (49 entries)
> >         -----    ----   --------
> >   0)     5064     304   get_page_from_freelist+0x2e4/0x722
> >   1)     4760     240   __alloc_pages_nodemask+0x15f/0x6a7
> >   2)     4520      48   kmem_getpages+0x61/0x12c
> >   3)     4472      96   cache_grow+0xca/0x272
> >   4)     4376      80   cache_alloc_refill+0x1d4/0x226
> >   5)     4296      64   kmem_cache_alloc+0x129/0x1bc
> >   6)     4232      16   mempool_alloc_slab+0x16/0x18
> >   7)     4216     144   mempool_alloc+0x56/0x104
> >   8)     4072      16   scsi_sg_alloc+0x48/0x4a [scsi_mod]
> >   9)     4056      96   __sg_alloc_table+0x58/0xf8
> >  10)     3960      32   scsi_init_sgtable+0x37/0x8f [scsi_mod]
> >  11)     3928      32   scsi_init_io+0x24/0xce [scsi_mod]
> >  12)     3896      48   scsi_setup_fs_cmnd+0xbc/0xc4 [scsi_mod]
> >  13)     3848     144   sd_prep_fn+0x1d3/0xc13 [sd_mod]
> >  14)     3704      64   blk_peek_request+0xe2/0x1a6
> >  15)     3640      96   scsi_request_fn+0x87/0x522 [scsi_mod]
> >  16)     3544      32   __blk_run_queue+0x88/0x14b
> >  17)     3512      48   elv_insert+0xb7/0x254
> >  18)     3464      48   __elv_add_request+0x9f/0xa7
> >  19)     3416     128   __make_request+0x3f4/0x476
> >  20)     3288     192   generic_make_request+0x332/0x3a4
> >  21)     3096      64   submit_bio+0xc4/0xcd
> >  22)     3032      80   _xfs_buf_ioapply+0x222/0x252 [xfs]
> >  23)     2952      48   xfs_buf_iorequest+0x84/0xa1 [xfs]
> >  24)     2904      32   xlog_bdstrat+0x47/0x4d [xfs]
> >  25)     2872      64   xlog_sync+0x21a/0x329 [xfs]
> >  26)     2808      48   xlog_state_release_iclog+0x9b/0xa8 [xfs]
> >  27)     2760     176   xlog_write+0x356/0x506 [xfs]
> >  28)     2584      96   xfs_log_write+0x5a/0x86 [xfs]
> >  29)     2488     368   xfs_trans_commit_iclog+0x165/0x2c3 [xfs]
> >  30)     2120      80   _xfs_trans_commit+0xd8/0x20d [xfs]
> >  31)     2040     240   xfs_iomap_write_allocate+0x247/0x336 [xfs]
> >  32)     1800     144   xfs_iomap+0x31a/0x345 [xfs]
> >  33)     1656      48   xfs_map_blocks+0x3c/0x40 [xfs]
> >  34)     1608     256   xfs_page_state_convert+0x2c4/0x597 [xfs]
> >  35)     1352      64   xfs_vm_writepage+0xf5/0x12f [xfs]
> >  36)     1288      32   __writepage+0x17/0x34
> >  37)     1256     288   write_cache_pages+0x1f3/0x2f8
> >  38)      968      16   generic_writepages+0x24/0x2a
> >  39)      952      64   xfs_vm_writepages+0x4f/0x5c [xfs]
> >  40)      888      16   do_writepages+0x21/0x2a
> >  41)      872      48   writeback_single_inode+0xd8/0x2f4
> >  42)      824     112   writeback_inodes_wb+0x41a/0x51e
> >  43)      712     176   wb_writeback+0x13d/0x1b7
> >  44)      536     128   wb_do_writeback+0x150/0x167
> >  45)      408      80   bdi_writeback_task+0x43/0x117
> >  46)      328      48   bdi_start_fn+0x76/0xd5
> >  47)      280      96   kthread+0x82/0x8a
> >  48)      184     184   kernel_thread_helper+0x4/0x10
> >
> > XFS as you can see is quite deep there. Now consider if
> > get_page_from_freelist() there had entered direct reclaim and then tried
> > to writeback a page. That's the problem that is being worried about.
>
> It would be a problem because it should be !__GFP_IO at that point so
> something would be seriously broken if it called ->writepage again.
>

True, ignore this as Christoph's example makes more sense.
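
For anyone reading the quoted trace: Depth is the number of stack bytes in
use at that frame and Size is that frame's own share, so each Depth minus its
Size should equal the Depth of the next entry. A minimal sketch that checks
this and totals usage per module follows. It is illustrative only: the helper
names and the "core" label for symbols without a module tag are hypothetical,
and only entries 21-40 are copied from the trace.

```python
import re
from collections import defaultdict

# Entries 21-40 copied from the quoted trace (a subset, not all 49 entries).
TRACE = """\
21) 3096 64 submit_bio+0xc4/0xcd
22) 3032 80 _xfs_buf_ioapply+0x222/0x252 [xfs]
23) 2952 48 xfs_buf_iorequest+0x84/0xa1 [xfs]
24) 2904 32 xlog_bdstrat+0x47/0x4d [xfs]
25) 2872 64 xlog_sync+0x21a/0x329 [xfs]
26) 2808 48 xlog_state_release_iclog+0x9b/0xa8 [xfs]
27) 2760 176 xlog_write+0x356/0x506 [xfs]
28) 2584 96 xfs_log_write+0x5a/0x86 [xfs]
29) 2488 368 xfs_trans_commit_iclog+0x165/0x2c3 [xfs]
30) 2120 80 _xfs_trans_commit+0xd8/0x20d [xfs]
31) 2040 240 xfs_iomap_write_allocate+0x247/0x336 [xfs]
32) 1800 144 xfs_iomap+0x31a/0x345 [xfs]
33) 1656 48 xfs_map_blocks+0x3c/0x40 [xfs]
34) 1608 256 xfs_page_state_convert+0x2c4/0x597 [xfs]
35) 1352 64 xfs_vm_writepage+0xf5/0x12f [xfs]
36) 1288 32 __writepage+0x17/0x34
37) 1256 288 write_cache_pages+0x1f3/0x2f8
38) 968 16 generic_writepages+0x24/0x2a
39) 952 64 xfs_vm_writepages+0x4f/0x5c [xfs]
40) 888 16 do_writepages+0x21/0x2a
"""

# index) depth size symbol+offset/length, optionally followed by [module]
LINE_RE = re.compile(r"\s*(\d+)\)\s+(\d+)\s+(\d+)\s+(\S+)(?:\s+\[(\w+)\])?")


def parse_trace(text):
    """Return a list of (depth, size, symbol, module) tuples."""
    frames = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            _, depth, size, symbol, module = m.groups()
            # "core" is a made-up label for symbols with no [module] tag.
            frames.append((int(depth), int(size), symbol, module or "core"))
    return frames


def is_consistent(frames):
    """Check that depth - size of each frame equals the next frame's depth."""
    for current, following in zip(frames, frames[1:]):
        depth, size = current[0], current[1]
        if depth - size != following[0]:
            return False
    return True


def usage_by_module(frames):
    """Sum the Size column per module."""
    totals = defaultdict(int)
    for _, size, _, module in frames:
        totals[module] += size
    return dict(totals)


if __name__ == "__main__":
    frames = parse_trace(TRACE)
    print(is_consistent(frames))
    print(usage_by_module(frames))
```

On this subset the [xfs] frames add up to 1808 bytes and the untagged ones
to 416 bytes.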

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab