On Tue 19-04-11 10:34:23, Vivek Goyal wrote:
> On Tue, Apr 19, 2011 at 10:17:17PM +0800, Wu Fengguang wrote:
> > [snip]
> > > > > > For throttling case, apart from metadata, I found that with simple
> > > > > > throttling of data I ran into issues with journalling with ext4 mounted
> > > > > > in ordered mode. So it was suggested that WRITE IO throttling should
> > > > > > not be done at device level instead try to do it in higher layers,
> > > > > > possibly balance_dirty_pages() and throttle process early.
> > > > >
> > > > > The problem with doing it at the page cache entry level is that
> > > > > cache hits then get throttled. It's not really an IO controller at
> > > > > that point, and the impact on application performance could be huge
> > > > > (i.e. MB/s instead of GB/s).
> > > >
> > > > Agreed that throttling cache hits is not a good idea. Can we determine
> > > > if page being asked for is in cache or not and charge for IO accordingly.
> > >
> > > You'd need hooks in find_or_create_page(), though you have no
> > > context of whether a read or a write is in progress at that point.
> >
> > I'm confused. Where is the throttling at cache hits?
> >
> > The balance_dirty_pages() throttling kicks in at write() syscall and
> > page fault time. For example, generic_perform_write(), do_wp_page()
> > and __do_fault() will explicitly call
> > balance_dirty_pages_ratelimited() to do the write throttling.
>
> This comment was in the context of what if we move block IO controller read
> throttling also in higher layers. Then we don't want to throttle reads
> which are already in cache.
>
> Currently throttling hook is in generic_make_request() and it kicks in
> only if data is not present in page cache and actual disk IO is initiated.

You can always throttle in readpage(). It's not much higher than
generic_make_request() but basically as high as it can get I suspect
(otherwise you'd have to deal with lots of different code paths like page
faults, splice, read, ...).
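
To make the cache-hit point concrete, here is a rough user-space sketch, not kernel code. It only models the idea that a hook placed in a readpage()-like path is reached on a cache miss and never on a hit. Every name in it (toy_cache, toy_throttle_charge, toy_readpage, toy_read) is made up for illustration, and the "throttle" just counts charged reads instead of delaying anything.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PAGES 8

/* Toy page cache: cached[i] becomes true once page i has been read in. */
struct toy_cache {
	bool cached[NR_PAGES];
	unsigned long throttled_ios;	/* reads charged to the (hypothetical) IO controller */
};

/* Hypothetical stand-in for a throttling hook; it only counts charges. */
static void toy_throttle_charge(struct toy_cache *c)
{
	c->throttled_ios++;
}

/* Stand-in for a readpage()-like path: reached only on a cache miss. */
static void toy_readpage(struct toy_cache *c, unsigned int index)
{
	toy_throttle_charge(c);		/* charge before "issuing" the disk read */
	c->cached[index] = true;	/* page is now in the toy cache */
}

/* Common read entry point: cache hits return without touching the throttle. */
static void toy_read(struct toy_cache *c, unsigned int index)
{
	assert(index < NR_PAGES);
	if (!c->cached[index])
		toy_readpage(c, index);
}
```

With a zero-initialized toy_cache, reading page 3 twice and page 5 once charges the throttle twice: the second read of page 3 is a hit and bypasses toy_readpage() entirely. The different entry paths (page faults, splice, read) would all funnel into the same miss path in this model, which is why a single hook there is enough.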

								Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR