On Fri 22-05-15 17:13:40, Tejun Heo wrote: > Currently, balance_dirty_pages() always work on bdi->wb. This patch > updates it to work on the wb (bdi_writeback) matching memcg and blkcg > of the current task as that's what the inode is being dirtied against. > > balance_dirty_pages_ratelimited() now pins the current wb and passes > it to balance_dirty_pages(). > > As no filesystem has FS_CGROUP_WRITEBACK yet, this doesn't lead to > visible behavior differences. ... > void balance_dirty_pages_ratelimited(struct address_space *mapping) > { > - struct backing_dev_info *bdi = inode_to_bdi(mapping->host); > - struct bdi_writeback *wb = &bdi->wb; > + struct inode *inode = mapping->host; > + struct backing_dev_info *bdi = inode_to_bdi(inode); > + struct bdi_writeback *wb = NULL; > int ratelimit; > int *p; > > if (!bdi_cap_account_dirty(bdi)) > return; > > + if (inode_cgwb_enabled(inode)) > + wb = wb_get_create_current(bdi, GFP_KERNEL); > + if (!wb) > + wb = &bdi->wb; > + So this effectively adds a radix tree lookup (of wb belonging to memcg) for every set_page_dirty() call. That seems relatively costly to me. And all that just to check wb->dirty_exceeded. Cannot we just use inode_to_wb() instead? I understand results may be different if multiple memcgs share an inode and that's the reason why you use wb_get_create_current(), right? But for dirty_exceeded check it may be good enough? Honza > ratelimit = current->nr_dirtied_pause; > if (wb->dirty_exceeded) > ratelimit = min(ratelimit, 32 >> (PAGE_SHIFT - 10)); > @@ -1616,7 +1622,9 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping) > preempt_enable(); > > if (unlikely(current->nr_dirtied >= ratelimit)) > - balance_dirty_pages(mapping, current->nr_dirtied); > + balance_dirty_pages(mapping, wb, current->nr_dirtied); > + > + wb_put(wb); > } > EXPORT_SYMBOL(balance_dirty_pages_ratelimited); > > -- > 2.4.0 > -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe cgroups" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html