On Wed, Jun 13, 2012 at 10:48:40PM +0800, Fengguang Wu wrote:
> > This really feels like we're papering over the problem.
>
> That's true. The majority users probably don't want to cache 100s
> worth of data in memory. It may be worthwhile to add a new per-bdi
> limit whose unit is number-of-seconds (of dirty data).

Doesn't work. You have a BBWC that takes in 500MB of random 4k
writes in a second, then starts to flush and needs to do a RMW cycle
for every 4k write it cached. On RAID5/6, the flush rate will be
about 100 IOPS, so it could take half an hour to flush those writes
that took a second to dump into the cache. IO for that entire half
hour will be extremely slow, and if you issue a sync during it, then
that's when you get a hung task timer.

Limiting the amount of writeback to a few seconds of IO simply won't
fix this - the ingest rate of BBWCs is simply too great to prevent
such events by a slow-moving bandwidth throttle...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
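
For reference, the flush-time estimate in the message can be checked with a quick back-of-the-envelope calculation. This is only a sketch under stated assumptions: the 500MB, 4k and 100 IOPS figures come from the message, while reading "MB" as 10**6 bytes, charging exactly one flush operation per cached write, and the helper name flush_seconds are assumptions made here for illustration.

```python
# Back-of-the-envelope check of the flush-time estimate.
# Figures come from the message; MB is assumed to mean 10**6 bytes.

CACHED_BYTES = 500 * 10**6   # data the BBWC absorbs in about one second
WRITE_SIZE = 4 * 1024        # size of each random write (4k)
FLUSH_IOPS = 100             # approximate RAID5/6 flush rate with RMW per write


def flush_seconds(cached_bytes: int, write_size: int, iops: float) -> float:
    """Time to drain the cache, assuming one flush operation per cached write."""
    writes = cached_bytes / write_size
    return writes / iops


seconds = flush_seconds(CACHED_BYTES, WRITE_SIZE, FLUSH_IOPS)
print(f"cached writes: {CACHED_BYTES // WRITE_SIZE}")
print(f"flush time: {seconds:.0f} s (~{seconds / 60:.0f} min)")
```

Under these assumptions the drain time comes out at about 20 minutes, the same order of magnitude as the half hour quoted in the message. The exact figure depends on the real per-write RMW cost.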