On Mon, Mar 15, 2010 at 10:38:41AM -0400, Vivek Goyal wrote:
> > > bdi_thres ~= per_memory_cgroup_dirty * bdi_fraction
> > >
> > > But bdi_nr_reclaimable and bdi_nr_writeback stats are still global.
> >
> > Why doesn't bdi_thresh of the ROOT cgroup depend on the global number?
>
> I think in the current implementation the ROOT cgroup's bdi_thresh is always
> the same as the global number. It is only for the other child groups that it
> differs from the global number, because of the reduced dirtyable_memory()
> limit. And we don't seem to be allowing any control on the root group.
>
> But I am wondering what happens in the following case.
>
> IIUC, with use_hierarchy=0, if I create two test groups test1 and test2, the
> hierarchy looks as follows.
>
>		root	test1	test2
>
> Now the root group's DIRTYABLE is still system wide, but test1's and test2's
> dirtyable will be reduced based on the RES_LIMIT in those groups.
>
> Conceptually, the per cgroup dirty ratio is like fixing the page cache share
> of each group. So effectively we are saying that these limits apply only to
> the child groups of root, but not to root as such?

Correct. In this implementation the root cgroup means "outside all cgroups".
I think this can be an acceptable behaviour, since in general we don't set
any limit on the root cgroup.

> > > So for the same number of dirty pages system wide on this bdi, we will be
> > > triggering writeouts much more aggressively if somebody has created a few
> > > memory cgroups and tasks are running in those cgroups.
> > >
> > > I guess it might cause performance regressions in case of small file
> > > writeouts, because previously one could have written the file to cache
> > > and been done with it, but with this patch set there are higher chances
> > > that you will be throttled to write the pages back to disk.
> > >
> > > I guess we need two pieces to resolve this.
> > > - BDI stats per cgroup.
> > > - Writeback of inodes from the same cgroup.
> > >
> > > I think BDI stats per cgroup will increase the complexity.
> >
> > Thank you for the clarification. IIUC, the dirty_limit implementation
> > should assume there is an I/O resource controller; usual users will
> > probably use the I/O resource controller and memcg at the same time.
> > Then, my question is: what happens when this is used with the I/O
> > resource controller?
>
> Currently the IO resource controller keeps all the async IO queues in the
> root group, so we can't measure exactly. But my guess is that until and
> unless we at least implement "writeback inodes from the same cgroup" we will
> not see an increased flow of writes from one cgroup over another.

Agreed. And I plan to look at the "writeback inodes per cgroup" feature soon.
I'm sorry, but I have some deadlines this week, so I'll probably start working
on this next weekend.

-Andrea
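
For a rough sense of the numbers behind the regression concern quoted above,
the sketch below plugs a made-up cgroup limit into the
bdi_thres ~= per_memory_cgroup_dirty * bdi_fraction relation. The
dirty_thresh() helper, the percentages, and the page counts are all invented
for illustration; this is not code from the patch set or from the kernel.

#include <stdio.h>

/*
 * Toy model of bdi_thres ~= per_memory_cgroup_dirty * bdi_fraction.
 * dirty_thresh() is a made-up helper: it takes a pool of dirtyable pages,
 * applies a dirty_ratio-style percentage, and then scales the result by
 * the bdi's share of recent writeout.
 */
static unsigned long dirty_thresh(unsigned long dirtyable_pages,
				  unsigned int dirty_ratio,	/* percent */
				  unsigned int bdi_share)	/* percent */
{
	unsigned long limit = dirtyable_pages * dirty_ratio / 100;

	return limit * bdi_share / 100;
}

int main(void)
{
	unsigned long global_dirtyable = 1UL << 20;	/* ~4GB of 4k pages */
	unsigned long memcg_dirtyable  = 1UL << 17;	/* 512MB RES_LIMIT */
	unsigned int dirty_ratio = 20;			/* vm.dirty_ratio-like */
	unsigned int bdi_share   = 80;			/* this bdi's share */

	printf("root  bdi_thresh: %lu pages\n",
	       dirty_thresh(global_dirtyable, dirty_ratio, bdi_share));
	printf("memcg bdi_thresh: %lu pages\n",
	       dirty_thresh(memcg_dirtyable, dirty_ratio, bdi_share));
	return 0;
}

With these invented numbers the memcg threshold comes out roughly 8x smaller
than the root one for the same bdi, while bdi_nr_reclaimable and
bdi_nr_writeback remain global, so tasks in the limited group hit the
writeback path much earlier; that is the small-file writeout regression
discussed above.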