On Mon, 28 Mar 2011 19:31:08 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:

> On Mon, 28 Mar 2011 11:48:20 +0200
> Michal Hocko <mhocko@xxxxxxx> wrote:
>
> > On Mon 28-03-11 18:11:27, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 28 Mar 2011 09:43:42 +0200
> > > Michal Hocko <mhocko@xxxxxxx> wrote:
> > >
> > > > On Mon 28-03-11 13:25:50, Daisuke Nishimura wrote:
> > > > > From: Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx>
> > [...]
> > > > > +5.5 usage_in_bytes
> > > > > +
> > > > > +As described in 2.1, memory cgroup uses res_counter for tracking and limiting
> > > > > +the memory usage. memory.usage_in_bytes shows the current res_counter usage for
> > > > > +memory, and DOESN'T show a actual usage of RSS and Cache. It is usually bigger
> > > > > +than the actual usage for a performance improvement reason.
> > > >
> > > > Isn't an explicit mention about caching charges better?
> > > > >
> > > It's difficult to distinguish which is spec. and which is implemnation details...
> >
> > Sure. At least commit log should contain the implementation details IMO,
> > though.
> >
> > >
> > > My one here ;)
> > > ==
> > > 5.5 usage_in_bytes
> > >
> > > For efficiency, as other kernel components, memory cgroup uses some optimization to
> > > avoid unnecessary cacheline false sharing. usage_in_bytes is affected by the
> > > method and doesn't show 'exact' value of usage, it's an fuzz value for efficient
> > > access. (Of course, when necessary, it's synchronized.)
> > > In usual, the value (RSS+CACHE) in memory.stat shows more exact value. IOW,
> >
> > - In usual, the value (RSS+CACHE) in memory.stat shows more exact value. IOW,
> > + (RSS+CACHE) value from memory.stat shows more exact value and should be used
> > + by userspace. IOW,
> >
> > ?
> >
>
> seems good. Nishimura-san, could you update ?
>
> Thanks,
> -Kame
>

Thank you very much for your comments. This is the updated one.
===
From: Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx>

Since 569b846d(memcg: coalesce uncharge during unmap/truncate), we do batched
(delayed) uncharge at truncation/unmap. And since cdec2e42(memcg: coalesce
charging via percpu storage), we have percpu cache for res_counter.

These changes improved performance of memory cgroup very much, but made
res_counter->usage usually have a bigger value than the actual value of memory
usage. So, *.usage_in_bytes, which show res_counter->usage, are not desirable
for precise values of memory(and swap) usage anymore.

Instead of removing these files completely(because we cannot know
res_counter->usage without them), this patch updates the meaning of those
files.

Signed-off-by: Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx>
---
 Documentation/cgroups/memory.txt |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 7781857..4f49d91 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -52,8 +52,10 @@ Brief summary of control files.
  tasks                           # attach a task(thread) and show list of threads
  cgroup.procs                    # show list of processes
  cgroup.event_control            # an interface for event_fd()
- memory.usage_in_bytes           # show current memory(RSS+Cache) usage.
- memory.memsw.usage_in_bytes     # show current memory+Swap usage
+ memory.usage_in_bytes           # show current res_counter usage for memory
+                                 (See 5.5 for details)
+ memory.memsw.usage_in_bytes     # show current res_counter usage for memory+Swap
+                                 (See 5.5 for details)
  memory.limit_in_bytes           # set/show limit of memory usage
  memory.memsw.limit_in_bytes     # set/show limit of memory+Swap usage
  memory.failcnt                  # show the number of memory usage hits limits
@@ -453,6 +455,15 @@ memory under it will be reclaimed.
 You can reset failcnt by writing 0 to failcnt file.
 # echo 0 > .../memory.failcnt
 
+5.5 usage_in_bytes
+
+For efficiency, as other kernel components, memory cgroup uses some optimization
+to avoid unnecessary cacheline false sharing. usage_in_bytes is affected by the
+method and doesn't show 'exact' value of memory(and swap) usage, it's an fuzz
+value for efficient access. (Of course, when necessary, it's synchronized.)
+If you want to know more exact memory usage, you should use RSS+CACHE(+SWAP)
+value in memory.stat(see 5.2).
+
 6. Hierarchy support
 
 The memory controller supports a deep hierarchy and hierarchical accounting.
-- 
1.7.1
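
As an illustration of the last paragraph of 5.5 (not part of the patch), here
is a minimal Python sketch of how userspace might compute RSS+CACHE(+SWAP)
from memory.stat instead of reading usage_in_bytes. The helper names, the
sample text and its numbers are hypothetical. The sketch only assumes that
memory.stat consists of "key value" lines including "rss", "cache" and
"swap"; a missing key is treated as 0.

```python
from pathlib import Path


def parse_memory_stat(text):
    """Parse 'key value' lines of a memory.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <non-negative integer>".
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def usage_from_stat(stats, include_swap=False):
    """Return rss + cache (+ swap) in bytes; absent keys count as 0."""
    total = stats.get("rss", 0) + stats.get("cache", 0)
    if include_swap:
        total += stats.get("swap", 0)
    return total


def read_usage(cgroup_dir, include_swap=False):
    """Read memory.stat from cgroup_dir (a caller-supplied, assumed path)."""
    text = (Path(cgroup_dir) / "memory.stat").read_text()
    return usage_from_stat(parse_memory_stat(text), include_swap)


# Made-up sample contents, for illustration only.
SAMPLE = "cache 4096\nrss 8192\nmapped_file 0\nswap 1024\n"

if __name__ == "__main__":
    stats = parse_memory_stat(SAMPLE)
    print(usage_from_stat(stats))                     # rss + cache
    print(usage_from_stat(stats, include_swap=True))  # rss + cache + swap
```

To use it against a real group, pass the directory where that group's control
files live to read_usage(); the mount point depends on the system, so no path
is assumed here.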