Sorry, attached is the "separate ACCOUNTING from THROTTLING" patch.

> It's very possible to throttle meta data READS/WRITES, as long as they
> can be attributed to the original task (assuming task oriented throttling
> instead of bio/request oriented).
>
> The trick is to separate the concepts of THROTTLING and ACCOUNTING.
> You can ACCOUNT data and meta data reads/writes to the right task, and
> only THROTTLE the task when it's doing data reads/writes.
>
> FYI I played the same trick for balance_dirty_pages_ratelimited() for
> another reason: _accurate_ accounting of dirtied pages.
>
> That trick should play well with most applications that do interleaved
> data and meta data reads/writes. For the special case of "find", which
> does pure meta data reads, we can still throttle it by playing another
> trick: THROTTLE meta data reads/writes with a much higher threshold
> than that of data. So normal applications will almost always be
> throttled at data accesses while "find" will be throttled at meta data
> accesses.
>
> For a real example of how it works, you can check this patch (plus the
> attached one)
>
>     writeback: IO-less balance_dirty_pages()
>     http://git.kernel.org/?p=linux/kernel/git/wfg/writeback.git;a=commitdiff;h=e0de5e9961eeb992f305e877c5ef944fcd7a4269;hp=992851d56d79d227beaba1e4dcc657cbcf815556
>
> Where tsk->nr_dirtied does dirty ACCOUNTING and tsk->nr_dirtied_pause
> is the threshold for THROTTLING. When
>
>     tsk->nr_dirtied > tsk->nr_dirtied_pause
>
> the task will voluntarily enter balance_dirty_pages() to take a
> nap (pause time will be proportional to tsk->nr_dirtied), and when
> finished, start a new account-and-throttle period by resetting
> tsk->nr_dirtied and possibly adjusting tsk->nr_dirtied_pause for a more
> reasonable pause time at the next sleep.
>
> BTW, I'd like to advocate balance_dirty_pages() based IO controller :)
>
> As you may have noticed, it's not all that hard: the main functions
> blkcg_update_bandwidth()/blkcg_update_dirty_ratelimit() can fit nicely
> in one screen!
>
>     writeback: async write IO controllers
>     http://git.kernel.org/?p=linux/kernel/git/wfg/writeback.git;a=commitdiff;h=1a58ad99ce1f6a9df6618a4b92fa4859cc3e7e90;hp=5b6fcb3125ea52ff04a2fad27a51307842deb1a0
>
> Thanks,
> Fengguang
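
For readers who want the account-and-throttle period from the quoted text in concrete form, here is a minimal userspace sketch. It is not kernel code: struct task_acct, enum io_kind and account_and_maybe_throttle() are hypothetical names made up for illustration, the thresholds are arbitrary, and the actual sleep is left out. It only shows the idea of accounting every IO while checking data IO against a low threshold and meta data IO against a much higher one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical per-task state. It loosely mirrors the roles of
 * tsk->nr_dirtied and tsk->nr_dirtied_pause, but is not the kernel's layout.
 */
struct task_acct {
	unsigned long nr_accounted;	/* ACCOUNTING: every IO, data or meta data */
	unsigned long data_pause;	/* THROTTLING threshold checked on data IO */
	unsigned long meta_pause;	/* much higher threshold checked on meta data IO */
	unsigned long nr_pauses;	/* how many times the task was told to nap */
};

enum io_kind { IO_DATA, IO_META };

/* Account one unit of IO; return true if the task should pause now. */
static bool account_and_maybe_throttle(struct task_acct *t, enum io_kind kind)
{
	unsigned long threshold;

	assert(t != NULL);
	t->nr_accounted++;		/* always account, whatever the IO kind */

	threshold = (kind == IO_DATA) ? t->data_pause : t->meta_pause;
	if (t->nr_accounted <= threshold)
		return false;

	/*
	 * A real implementation would sleep here, for a time proportional
	 * to nr_accounted, and might retune the threshold afterwards.
	 */
	t->nr_pauses++;
	t->nr_accounted = 0;		/* start a new account-and-throttle period */
	return true;
}
```

With data_pause = 4 and meta_pause = 32, a task that interleaves data and meta data IO gets paused at a data access soon after its fourth IO, while a "find"-like task doing only meta data IO is not paused until its 33rd.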
Subject: writeback: accurately account dirtied pages
Date: Thu Apr 14 07:52:37 CST 2011

Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
---
 mm/page-writeback.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- linux-next.orig/mm/page-writeback.c	2011-04-16 11:28:41.000000000 +0800
+++ linux-next/mm/page-writeback.c	2011-04-16 11:28:41.000000000 +0800
@@ -1352,8 +1352,6 @@ void balance_dirty_pages_ratelimited_nr(
 	if (!bdi_cap_account_dirty(bdi))
 		return;
 
-	current->nr_dirtied += nr_pages_dirtied;
-
 	if (dirty_exceeded_recently(bdi, MAX_PAUSE)) {
 		unsigned long max = current->nr_dirtied +
 						(128 >> (PAGE_SHIFT - 10));
@@ -1819,6 +1817,7 @@ void account_page_dirtied(struct page *p
 		__inc_bdi_stat(mapping->backing_dev_info, BDI_DIRTIED);
 		task_dirty_inc(current);
 		task_io_account_write(PAGE_CACHE_SIZE);
+		current->nr_dirtied++;
 	}
 }
 EXPORT_SYMBOL(account_page_dirtied);
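
As a rough illustration of what the patch changes, the toy model below counts a page at the point where it actually turns dirty, and leaves the rate-limit hook with nothing to do but compare the counter against a threshold. This is a userspace sketch under my own assumptions, not the kernel code: struct toy_task, struct toy_page, toy_set_page_dirty() and toy_should_pause() are hypothetical names, and the assumption that accounting happens only on a clean-to-dirty transition is my reading of where account_page_dirtied() is called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the task and page state involved. */
struct toy_task { unsigned long nr_dirtied; };
struct toy_page { bool dirty; };

/*
 * Stand-in for the dirtying path: ACCOUNT here, and only when the page
 * goes from clean to dirty, so re-dirtying a page is not counted twice.
 */
static void toy_set_page_dirty(struct toy_task *tsk, struct toy_page *page)
{
	assert(tsk != NULL && page != NULL);
	if (page->dirty)
		return;
	page->dirty = true;
	tsk->nr_dirtied++;
}

/*
 * Stand-in for the rate-limit hook: it no longer adds a caller-supplied
 * page count, it only decides whether the task should THROTTLE.
 */
static bool toy_should_pause(const struct toy_task *tsk,
			     unsigned long pause_threshold)
{
	return tsk->nr_dirtied > pause_threshold;
}
```

In this model, writing to the same page three times leaves nr_dirtied at 1, whereas a scheme that adds a count on every call to the rate-limit hook would have reported 3.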