On Fri, Apr 08, 2011 at 09:42:49AM +1000, Dave Chinner wrote:
> On Thu, Apr 07, 2011 at 03:24:24PM -0400, Vivek Goyal wrote:
> > On Thu, Apr 07, 2011 at 09:36:02AM +1000, Dave Chinner wrote:
> [...]
> > > > When I_DIRTY is cleared, remove inode from bdi_memcg->b_dirty.  Delete bdi_memcg
> > > > if the list is now empty.
> > > >
> > > > balance_dirty_pages() calls mem_cgroup_balance_dirty_pages(memcg, bdi)
> > > >    if over bg limit, then
> > > >        set bdi_memcg->b_over_limit
> > > >            If there is no bdi_memcg (because all inodes of current's
> > > >            memcg dirty pages were first dirtied by other memcg) then
> > > >            memcg lru to find inode and call writeback_single_inode().
> > > >            This is to handle uncommon sharing.
> > >
> > > We don't want to introduce any new IO sources into
> > > balance_dirty_pages(). This needs to trigger memcg-LRU based bdi
> > > flusher writeback, not try to write back inodes itself.
> >
> > Will we not enjoy more sequential IO traffic once we find an inode by
> > traversing memcg->lru list? So isn't that better than pure LRU based
> > flushing?
>
> Sorry, I wasn't particularly clear there. What I meant was that we
> ask the bdi-flusher thread to select the inode to write back from
> the LRU, not do it directly from balance_dirty_pages(). i.e.
> bdp stays IO-less.

Agreed. Even with cgroup aware writeback, we use bdi-flusher threads to
do the writeback, with no direct writeback in bdp.

>
> > > Alternatively, this problem won't exist if you transfer page cache
> > > state from one memcg to another when you move the inode from one
> > > memcg to another.
> >
> > But in case of shared inode problem still remains. inode is being written
> > from two cgroups and it can't be in both the groups as per the existing
> > design.
>
> But we've already determined that there is no use case for this
> shared inode behaviour, so we aren't going to explicitly support it,
> right?

Well, we are not designing for shared inodes to begin with, but one can
easily create that situation. So at least we need some defined behavior
for what happens if inodes are shared across multiple processes in the
same cgroup and across cgroups.

A database might have multiple threads/processes doing IO to a single
file. What if somebody moves some threads out to a separate cgroup, etc.?

So I am not saying that it is a common configuration, but we need to
define system behavior properly if sharing does happen.

Thanks
Vivek
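
To make the "bdp stays IO-less" point concrete, here is a minimal user-space sketch of the flow agreed on above. It is a toy model in Python, not kernel code. The names b_dirty and b_over_limit come from the quoted proposal; the Bdi and BdiMemcg classes, the work queue, the page counts, and flusher_run() are all assumptions made up for illustration. The shared-inode fallback discussed in the thread is deliberately not modelled.

```python
from collections import OrderedDict, deque


class BdiMemcg:
    """Hypothetical per-(bdi, memcg) state, loosely following the proposal."""

    def __init__(self):
        self.b_dirty = OrderedDict()  # inode -> dirty pages, oldest first
        self.b_over_limit = False


class Bdi:
    """Hypothetical stand-in for a backing device with a flusher work queue."""

    def __init__(self):
        self.memcgs = {}     # memcg name -> BdiMemcg
        self.work = deque()  # memcgs waiting for the flusher thread
        self.written = []    # (memcg, inode) pairs the flusher wrote back

    def mark_dirty(self, memcg, inode, pages=1):
        state = self.memcgs.setdefault(memcg, BdiMemcg())
        state.b_dirty[inode] = state.b_dirty.get(inode, 0) + pages

    def dirty_pages(self, memcg):
        state = self.memcgs.get(memcg)
        return sum(state.b_dirty.values()) if state else 0


def balance_dirty_pages(bdi, memcg, bg_limit):
    """IO-less: flag the memcg and queue work, but never write an inode."""
    state = bdi.memcgs.get(memcg)
    if state is None or bdi.dirty_pages(memcg) <= bg_limit:
        return False
    if not state.b_over_limit:
        state.b_over_limit = True
        bdi.work.append(memcg)
    return True


def flusher_run(bdi, bg_limit):
    """Flusher thread: the only place in this model that issues writeback."""
    while bdi.work:
        memcg = bdi.work.popleft()
        state = bdi.memcgs.get(memcg)
        if state is None:
            continue
        # Write back the oldest inodes until the memcg is under its limit.
        while state.b_dirty and bdi.dirty_pages(memcg) > bg_limit:
            inode, _pages = state.b_dirty.popitem(last=False)
            bdi.written.append((memcg, inode))
        state.b_over_limit = False
        if not state.b_dirty:
            del bdi.memcgs[memcg]  # drop the bdi_memcg once its list is empty
```

In this sketch a call to balance_dirty_pages() only sets the flag and queues the memcg; inodes are written back solely when flusher_run() is invoked, which is the separation both sides agree on in the thread.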