On Thu, Oct 08, 2009 at 01:33:35PM +0800, Wu Fengguang wrote:
> On Wed, Oct 07, 2009 at 11:18:22PM +0800, Wu Fengguang wrote:
> > On Wed, Oct 07, 2009 at 09:47:14PM +0800, Peter Staubach wrote:
> > >
> > > > # vmmon -d 1 nr_writeback nr_dirty nr_unstable   # (per 1-second samples)
> > > > nr_writeback     nr_dirty  nr_unstable
> > > >        11227        41463        38044
> > > >        11227        41463        38044
> > > >        11227        41463        38044
> > > >        11227        41463        38044
> >
> > I guess in the above 4 seconds, either client or (more likely) server
> > is blocked. A blocked server cannot send ACKs to knock down both
>
> Yeah the server side is blocked. The nfsd are mostly blocked in
> generic_file_aio_write(), in particular, on the i_mutex lock! I'm
> copying one or two big files over NFS, so the i_mutex lock is heavily
> contended.
>
> I'm using the default wsize=4096 for NFS-root..

Just switched to 512k wsize, and things improved: most of the time the
8 nfsd threads are not all blocked. However, the bumpiness still remains:

nr_writeback     nr_dirty  nr_unstable
       11105        58080        15042
       11105        58080        15042
       11233        54583        18626
       11101        51964        22036
       11105        51978        22065
       11233        52362        22577
       10985        58538        13500
       11233        53748        19721
       11047        51999        21778
       11105        50262        23572
       11105        50262        20441
       10985        52772        20721
       10977        52109        21516
       11105        48296        26629
       11105        48296        26629
       10985        52191        21042
       11166        51456        22296
       10980        50681        24466
       11233        45352        30488
       11233        45352        30488
       11105        45475        30616
       11131        45313        20355
       11233        51126        22637
       11233        51126        22637

wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
  329   329 TS    -  -5  24   1  0.0 S<  worker_thread            nfsiod
 4690  4690 TS    -  -5  24   1  0.1 S<  svc_recv                 nfsd
 4691  4691 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4692  4692 TS    -  -5  24   0  0.1 R<  ?                        nfsd
 4693  4693 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4694  4694 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4695  4695 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4696  4696 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4697  4697 TS    -  -5  24   0  0.1 R<  ?                        nfsd

wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
  329   329 TS    -  -5  24   1  0.0 S<  worker_thread            nfsiod
 4690  4690 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
 4691  4691 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4692  4692 TS    -  -5  24   1  0.1 D<  log_wait_commit          nfsd
 4693  4693 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
 4694  4694 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4695  4695 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
 4696  4696 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
 4697  4697 TS    -  -5  24   1  0.1 D<  generic_file_aio_write   nfsd

wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
  329   329 TS    -  -5  24   1  0.0 S<  worker_thread            nfsiod
 4690  4690 TS    -  -5  24   1  0.1 S<  svc_recv                 nfsd
 4691  4691 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4692  4692 TS    -  -5  24   1  0.1 R<  ?                        nfsd
 4693  4693 TS    -  -5  24   1  0.1 R<  ?                        nfsd
 4694  4694 TS    -  -5  24   1  0.1 R<  ?                        nfsd
 4695  4695 TS    -  -5  24   1  0.1 S<  svc_recv                 nfsd
 4696  4696 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4697  4697 TS    -  -5  24   1  0.1 S<  svc_recv                 nfsd

wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
  329   329 TS    -  -5  24   1  0.0 S<  worker_thread            nfsiod
 4690  4690 TS    -  -5  24   1  0.1 D<  generic_file_aio_write   nfsd
 4691  4691 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4692  4692 TS    -  -5  24   1  0.1 D<  nfsd_sync                nfsd
 4693  4693 TS    -  -5  24   1  0.1 D<  sync_buffer              nfsd
 4694  4694 TS    -  -5  24   1  0.1 D<  generic_file_aio_write   nfsd
 4695  4695 TS    -  -5  24   1  0.1 S<  svc_recv                 nfsd
 4696  4696 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
 4697  4697 TS    -  -5  24   1  0.1 D<  generic_file_aio_write   nfsd

Thanks,
Fengguang

> wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
>   329   329 TS    -  -5  24   1  0.0 S<  worker_thread            nfsiod
>  4690  4690 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4691  4691 TS    -  -5  24   0  0.0 D<  generic_file_aio_write   nfsd
>  4692  4692 TS    -  -5  24   0  0.0 D<  generic_file_aio_write   nfsd
>  4693  4693 TS    -  -5  24   0  0.0 D<  generic_file_aio_write   nfsd
>  4694  4694 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4695  4695 TS    -  -5  24   1  0.1 D<  generic_file_aio_write   nfsd
>  4696  4696 TS    -  -5  24   1  0.0 D<  log_wait_commit          nfsd
>  4697  4697 TS    -  -5  24   0  0.0 D<  generic_file_aio_write   nfsd
>
> wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
>   329   329 TS    -  -5  24   1  0.0 S<  worker_thread            nfsiod
>  4690  4690 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4691  4691 TS    -  -5  24   0  0.0 D<  generic_file_aio_write   nfsd
>  4692  4692 TS    -  -5  24   0  0.0 D<  generic_file_aio_write   nfsd
>  4693  4693 TS    -  -5  24   0  0.0 D<  sync_buffer              nfsd
>  4694  4694 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4695  4695 TS    -  -5  24   1  0.1 D<  generic_file_aio_write   nfsd
>  4696  4696 TS    -  -5  24   1  0.0 D<  generic_file_aio_write   nfsd
>  4697  4697 TS    -  -5  24   0  0.0 D<  generic_file_aio_write   nfsd
>
> wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
>   329   329 TS    -  -5  24   1  0.0 S<  worker_thread            nfsiod
>  4690  4690 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4691  4691 TS    -  -5  24   0  0.1 D<  get_request_wait         nfsd
>  4692  4692 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4693  4693 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
>  4694  4694 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4695  4695 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4696  4696 TS    -  -5  24   0  0.1 S<  svc_recv                 nfsd
>  4697  4697 TS    -  -5  24   1  0.1 D<  generic_file_aio_write   nfsd
>
> wfg ~% ps -o pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:24,comm ax|g nfs
>   329   329 TS    -  -5  24   1  0.0 S<  worker_thread            nfsiod
>  4690  4690 TS    -  -5  24   1  0.1 D<  get_write_access         nfsd
>  4691  4691 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4692  4692 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4693  4693 TS    -  -5  24   1  0.1 D<  generic_file_aio_write   nfsd
>  4694  4694 TS    -  -5  24   1  0.1 D<  get_write_access         nfsd
>  4695  4695 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4696  4696 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>  4697  4697 TS    -  -5  24   0  0.1 D<  generic_file_aio_write   nfsd
>
> Thanks,
> Fengguang
>
> > nr_writeback/nr_unstable. And the stuck nr_writeback will freeze
> > nr_dirty as well, because the dirtying process is throttled until
> > it receives enough "PG_writeback cleared" events; however, the
> > bdi-flush thread is also blocked when trying to clear more
> > PG_writeback, because the client side nr_writeback limit has been
> > reached. In summary,
> >
> >     server blocked => nr_writeback stuck => nr_writeback limit reached
> >     => bdi-flush blocked => no end_page_writeback() => dirtier blocked
> >     => nr_dirty stuck
> >
> > Thanks,
> > Fengguang
> >
> > > >        11045        53987         6490
> > > >        11033        53120         8145
> > > >        11195        52143        10886
> > > >        11211        52144        10913
> > > >        11211        52144        10913
> > > >        11211        52144        10913
> > > >
> > > > btrfs seems to maintain a private pool of writeback pages, which
> > > > can go out of control:
> > > >
> > > > nr_writeback     nr_dirty
> > > >       261075          132
> > > >       252891          195
> > > >       244795          187
> > > >       236851          187
> > > >       228830          187
> > > >       221040          218
> > > >       212674          237
> > > >       204981          237
> > > >
> > > > XFS has very interesting "bumpy writeback" behavior: it tends to
> > > > wait to collect enough pages and then write the whole world.
> > > >
> > > > nr_writeback     nr_dirty
> > > >        80781            0
> > > >        37117        37703
> > > >        37117        43933
> > > >        81044            6
> > > >        81050            0
> > > >        43943        10199
> > > >        43930        36355
> > > >        43930        36355
> > > >        80293            0
> > > >        80285            0
> > > >        80285            0
> > > >
> > > > Thanks,
> > > > Fengguang
> > > >
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > > > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > Please read the FAQ at http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
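[Note for readers replaying this thread: vmmon appears to be the author's
local sampling script; the three counters it prints are standard fields of
/proc/vmstat. A rough shell equivalent of the one-second sampling above,
assuming a Linux /proc, might look like:]

```shell
# Rough stand-in for "vmmon -d 1 nr_writeback nr_dirty nr_unstable":
# print the three writeback counters from /proc/vmstat once per second.
# (The trailing space in the pattern keeps nr_dirty from also matching
# nr_dirty_threshold, and nr_writeback from matching nr_writeback_temp.
# On kernels >= 5.8 nr_unstable no longer exists; NFS unstable pages
# are accounted under nr_writeback there.)
for i in 1 2 3 4; do
    awk '/^nr_(writeback|dirty|unstable) / { printf "%-14s %8s\n", $1, $2 }' /proc/vmstat
    echo
    sleep 1
done
```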