On Wed, Mar 30, 2011 at 7:01 PM, KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote: > > Hi, > > In this LSF/MM, we have some memcg topics in the 1st day. > > From schedule, > > 1. Memory cgroup : Where next ? 1hour (Balbir Singh/Kamezawa) > 2. Memcg Dirty Limit and writeback 30min(Greg Thelen) > 3. Memcg LRU management 30min (Ying Han, Michal Hocko) > 4. Page cgroup on a diet (Johannes Weiner) > > 2.5 hours. This seems long...or short ? ;) I think it is a good starting plan. > I'd like to sort out topics before going. Please fix if I don't catch enough. > > mentiont to 1. later... > > Main topics on 2. Memcg Dirty Limit and writeback ....is > > a) How to implement per-memcg dirty inode finding method (list) and > how flusher threads handle memcg. I have some very rough code implementing the ideas discussed in http://thread.gmane.org/gmane.linux.kernel.mm/59707 Unfortunately, I do not yet have good patches, but maybe an RFC series soon. I can provide update on the direction I am thinking. > b) Hot to interact with IO-Less dirty page reclaim. > IIUC, if memcg doesn't handle this correctly, OOM happens. The last posted memcg dirty writeback patches were based on -mm at the time, which did not have IO-less balance_dirty_pages. I have an approach which I _think_ will be compatible with IO-less balance_dirty_pages(), but I need to talk with some writeback guys to confirm. Seeing the Writeback talk Mon 9:30am should be very useful for me. > Greg, do we need to have a shared session with I/O guys ? > If needed, current schedule is O.K. ? We can contact any interested writeback guys to see if they want to attend memcg-writeback discussion. We might be able to defer this detail until Mon morning. > Main topics on 3. Memcg LRU management > > a) Isolation/Gurantee for memcg. > Current memcg doesn't have enough isolation when globarl reclaim runs. > .....Because it's designed not to affect global reclaim. > But from user's point of view, it's nonsense and we should have some hints > for isolate set of memory or implement a guarantee. > > One way to go is updating softlimit better. To do this, we should know what > is problem now. I'm sorry I can't prepare data on this until LSF/MM. > Another way is implementing a guarantee. But this will require some interaction > with page allocator and pgscan mechanism. This will be a big work. > > b) single LRU and per memcg zone->lru_lock. > I hear zone->lru_lock contention caused by memcg is a problem on Google servers. > Okay, please show data. (I've never seen it.) > Then, we need to discuss Pros. and Cons. of current design and need to consinder > how to improve it. I think Google and Michal have their own implementation. > > Current design of double-LRU is from the 1st inclusion of memcg to the kernel. > But I don't know that discussion was there. Balbir, could you explain the reason > of this design ? Then, we can go ahead, somewhere. > > > Main topics on 4. Page cgroup on diet is... > > a) page_cgroup is too big!, we need diet.... > I think Johannes removes -> page pointer already. Ok, what's the next to > be removed ? > > I guess the next candidate is ->lru which is related to 3-b). > > Main topics on 1.Memory control groups: where next? is.. > > To be honest, I just do bug fixes in these days. And hot topics are on above.. > I don't have concrete topics. What I can think of from recent linux-mm emails are... > > a) Kernel memory accounting. > b) Need some work with Cleancache ? > c) Should we provide a auto memory cgroup for file caches ? > (Then we can implement a file-cache-limit.) > d) Do we have a problem with current OOM-disable+notifier design ? > e) ROOT cgroup should have a limit/softlimit, again ? > f) vm_overcommit_memory should be supproted with memcg ? > (I remember there was a trial. But I think it should be done in other cgroup > as vmemory cgroup.) > ... > > I think > a) discussing about this is too early. There is no patch. > I think we'll just waste time. > > b) enable/disable cleancache per memcg or some share/limit ?? > But we can discuss this kind of things after cleancache is in production use... > > c) AFAIK, some other OSs have this kind of feature, a box for file-cache. > Because file-cache is a shared object between all cgroups, it's difficult > to handle. It may be better to have a auto cgroup for file caches and add knobs > for memcg. > > d) I think it works well. > > e) It seems Michal wants this for lazy users. Hmm, should we have a knob ? > It's helpful that some guy have a performance number on the latest kernel > with and without memcg (in limitless case). > IIUC, with THP enabled as 'always', the number of page fault dramatically reduced and > memcg's accounting cost gets down... > > f) I think someone mention about this... > > Maybe c) and d) _can_ be a topic but seems not very important. > > So, for this slot, I'd like to discuss > > I) Softlimit/Isolation (was 3-A) for 1hour > If we have extra time, kernel memory accounting or file-cache handling > will be good. > > II) Dirty page handling. (for 30min) > Maybe we'll discuss about per-memcg inode queueing issue. > > III) Discussing the current and future design of LRU.(for 30+min) > > IV) Diet of page_cgroup (for 30-min) > Maybe this can be combined with III. > > Thanks, > -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href