On Sat, Oct 15, 2011 at 12:00:47AM +0800, Wu Fengguang wrote:
> On Fri, Oct 14, 2011 at 04:18:35AM +0800, Jan Kara wrote:
> > On Thu 13-10-11 22:39:39, Wu Fengguang wrote:
> > > > > +       long pause = 1;
> > > > > +       long max_pause = dirty_writeback_interval ?
> > > > > +                          msecs_to_jiffies(dirty_writeback_interval * 10) :
> > > > > +                          HZ;
> > > >
> > > > It's better not to put the flusher to sleeps more than 10ms, so that
> > > > when the condition changes, we don't risk making the storage idle for
> > > > too long time.
> > >
> > > Yeah, the one big regression case
> > >
> > >      3.1.0-rc8-ioless6a+  3.1.0-rc8-ioless6-requeue6+
> > > ------------------------  ------------------------
> > >                    47.07        -15.5%        39.78  thresh=1M/xfs-2dd-4k-8p-4096M-1M:10-X
> > >
> > > is exactly caused by the large sleep: the attached graphs are showing
> > > one period of no-progress on the number of written pages.
> >   Thanks for the tests! Interesting. Do you have trace file from that run?
> > I see the writeback stalled for 20s or so which is more than
> > dirty_writeback_centisecs so I think something more complicated must have
> > happened.
>
> I noticed that
>
> 1) the global dirty limit is exceeded (dirty=286, limit=256), hence
>    the dd tasks are hard blocked in balance_dirty_pages().
>
>        flush-8:0-1170  [004]   211.068427: global_dirty_state: dirty=286 writeback=0 unstable=0 bg_thresh=128 thresh=256 limit=256 dirtied=2084879 written=2081447
>
> 2) the flusher thread is not woken up because we test writeback_in_progress()
>    in balance_dirty_pages().
>
>         if (unlikely(!writeback_in_progress(bdi)))
>                 bdi_start_background_writeback(bdi);
>
>    Thus the flusher thread wait and wait as in below trace.
>
>        flush-8:0-1170  [004]   211.068427: global_dirty_state: dirty=286 writeback=0 unstable=0 bg_thresh=128 thresh=256 limit=256 dirtied=2084879 written=2081447
>        flush-8:0-1170  [004]   211.068428: task_io: read=9216 write=12873728 cancelled_write=0 nr_dirtied=0 nr_dirtied_pause=32
>        flush-8:0-1170  [004]   211.068428: writeback_start: bdi 8:0: sb_dev 0:0 nr_pages=9223372036854774848 sync_mode=0 kupdate=0 range_cyclic=1 background=1 reason=background
>        flush-8:0-1170  [004]   211.068440: writeback_single_inode: bdi 8:0: ino=131 state=I_DIRTY_SYNC dirtied_when=4294869658 age=9 index=0 to_write=1024 wrote=0
>        flush-8:0-1170  [004]   211.068442: writeback_written: bdi 8:0: sb_dev 0:0 nr_pages=9223372036854774848 sync_mode=0 kupdate=0 range_cyclic=1 background=1 reason=background
>        flush-8:0-1170  [004]   211.068443: writeback_wait: bdi 8:0: sb_dev 0:0 nr_pages=9223372036854774848 sync_mode=0 kupdate=0 range_cyclic=1 background=1 reason=background
>
>        flush-8:0-1170  [004]   213.110122: global_dirty_state: dirty=286 writeback=0 unstable=0 bg_thresh=128 thresh=256 limit=256 dirtied=2084879 written=2081447
>        flush-8:0-1170  [004]   213.110126: task_io: read=9216 write=12873728 cancelled_write=0 nr_dirtied=0 nr_dirtied_pause=32
>        flush-8:0-1170  [004]   213.110126: writeback_start: bdi 8:0: sb_dev 0:0 nr_pages=9223372036854774848 sync_mode=0 kupdate=0 range_cyclic=1 background=1 reason=background
>        flush-8:0-1170  [004]   213.110134: writeback_single_inode: bdi 8:0: ino=131 state=I_DIRTY_SYNC dirtied_when=4294869658 age=11 index=0 to_write=1024 wrote=0
>        flush-8:0-1170  [004]   213.110135: writeback_written: bdi 8:0: sb_dev 0:0 nr_pages=9223372036854774848 sync_mode=0 kupdate=0 range_cyclic=1 background=1 reason=background
>        flush-8:0-1170  [004]   213.110135: writeback_wait: bdi 8:0: sb_dev 0:0 nr_pages=9223372036854774848 sync_mode=0 kupdate=0 range_cyclic=1 background=1 reason=background
>
>        flush-8:0-1170  [004]   217.193470: global_dirty_state: dirty=286 writeback=0 unstable=0 bg_thresh=128 thresh=256 limit=256 dirtied=2084879 written=2081447
>        flush-8:0-1170  [004]   217.193471: task_io: read=9216 write=12873728 cancelled_write=0 nr_dirtied=0 nr_dirtied_pause=32
>        flush-8:0-1170  [004]   217.193471: writeback_start: bdi 8:0: sb_dev 0:0 nr_pages=9223372036854774848 sync_mode=0 kupdate=0 range_cyclic=1 background=1 reason=background
>        flush-8:0-1170  [004]   217.193483: writeback_single_inode: bdi 8:0: ino=131 state=I_DIRTY_SYNC dirtied_when=4294869658 age=15 index=0 to_write=1024 wrote=0
>        flush-8:0-1170  [004]   217.193485: writeback_written: bdi 8:0: sb_dev 0:0 nr_pages=9223372036854774848 sync_mode=0 kupdate=0 range_cyclic=1 background=1 reason=background

It's still puzzling why the dirty pages remain at 286 and are not cleaned
by either of the flusher threads (local XFS and NFSROOT) for such a long
time.

Thanks,
Fengguang
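
For reference, the sleep cap from the quoted patch fragment can be modelled in userspace. This is only a rough sketch, not the patch itself. The fragment shows just the initial pause and the cap, so the doubling step below is an assumption (the roughly 2 s and 4 s gaps between the writeback_start events in the trace are consistent with it, but do not prove it). All function names are hypothetical, and time is kept in milliseconds, treating the initial pause of one jiffy as 1 ms, to avoid depending on HZ.

```c
#include <assert.h>

/*
 * Cap on the flusher sleep, in milliseconds.
 * dirty_writeback_interval is in centiseconds, so "* 10" converts it to
 * milliseconds; the fallback of HZ jiffies is one second.
 */
static long max_pause_from_interval_ms(unsigned int interval_centisecs)
{
	return interval_centisecs ? (long)interval_centisecs * 10 : 1000;
}

/* ASSUMPTION: the pause doubles after each no-progress round, up to the cap. */
static long next_pause_ms(long pause_ms, long max_pause_ms)
{
	assert(pause_ms > 0 && max_pause_ms > 0);
	pause_ms *= 2;
	return pause_ms > max_pause_ms ? max_pause_ms : pause_ms;
}

/* Total time slept over a number of consecutive no-progress rounds. */
static long total_sleep_ms(long max_pause_ms, int rounds)
{
	long pause_ms = 1;	/* mirrors "long pause = 1", taken as 1 ms here */
	long total_ms = 0;

	for (int i = 0; i < rounds; i++) {
		total_ms += pause_ms;
		pause_ms = next_pause_ms(pause_ms, max_pause_ms);
	}
	return total_ms;
}
```

With the default dirty_writeback_interval of 500 centiseconds the cap comes out at 5000 ms, and under these assumptions 13 consecutive no-progress rounds add up to 8191 ms of sleep, after which every further round sleeps the full 5000 ms.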