[PATCH] zsmalloc: zsmalloc documentation

On Wed, Mar 04, 2015 at 04:56:10PM -0800, Andrew Morton wrote:
> On Thu, 5 Mar 2015 09:43:31 +0900 Minchan Kim <minchan@xxxxxxxxxx> wrote:
> 
> > Hello Andrew,
> > 
> > On Wed, Mar 04, 2015 at 02:02:02PM -0800, Andrew Morton wrote:
> > > On Wed,  4 Mar 2015 14:01:32 +0900 Minchan Kim <minchan@xxxxxxxxxx> wrote:
> > > 
> > > > +static int zs_stats_size_show(struct seq_file *s, void *v)
> > > > +{
> > > > +	int i;
> > > > +	struct zs_pool *pool = s->private;
> > > > +	struct size_class *class;
> > > > +	int objs_per_zspage;
> > > > +	unsigned long class_almost_full, class_almost_empty;
> > > > +	unsigned long obj_allocated, obj_used, pages_used;
> > > > +	unsigned long total_class_almost_full = 0, total_class_almost_empty = 0;
> > > > +	unsigned long total_objs = 0, total_used_objs = 0, total_pages = 0;
> > > > +
> > > > +	seq_printf(s, " %5s %5s %11s %12s %13s %10s %10s %16s\n",
> > > > +			"class", "size", "almost_full", "almost_empty",
> > > > +			"obj_allocated", "obj_used", "pages_used",
> > > > +			"pages_per_zspage");
> > > 
> > > Documentation?
> > 
> > It should have been there since [0f050d9, mm/zsmalloc: add statistics support].
> > Anyway, I will try it.
> > Where is the right place to put just these statistics in Documentation?
> > 
> > Documentation/zsmalloc.txt?
> > Documentation/vm/zsmalloc.txt?
> > Documentation/blockdev/zram.txt?
> > Documentation/ABI/testing/sysfs-block-zram?
> 
> hm, this is debugfs so Documentation/ABI/testing/sysfs-block-zram isn't
> the right place.
> 
> akpm3:/usr/src/25> grep -rli zsmalloc Documentation 
> akpm3:/usr/src/25> 
> 
> lol.
> 
> Documentation/vm/zsmalloc.txt looks good.

Here it goes.

From cb5ac24125c14467d1a5b6fbb92757d5517b0300 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@xxxxxxxxxx>
Date: Tue, 17 Mar 2015 10:02:07 +0900
Subject: [PATCH] zsmalloc: zsmalloc documentation

This patch adds zsmalloc documentation that explains the design concept
and the stat information. The design overview is moved there from the
comment at the top of mm/zsmalloc.c.

Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
---
 Documentation/vm/zsmalloc.txt | 70 +++++++++++++++++++++++++++++++++++++++++++
 MAINTAINERS                   |  1 +
 mm/zsmalloc.c                 | 29 ------------------
 3 files changed, 71 insertions(+), 29 deletions(-)
 create mode 100644 Documentation/vm/zsmalloc.txt

diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt
new file mode 100644
index 000000000000..64ed63c4f69d
--- /dev/null
+++ b/Documentation/vm/zsmalloc.txt
@@ -0,0 +1,70 @@
+zsmalloc
+--------
+
+This allocator is designed for use with zram. Thus, the allocator is
+supposed to work well under low memory conditions. In particular, it
+never attempts higher order page allocation which is very likely to
+fail under memory pressure. On the other hand, if we just use single
+(0-order) pages, it would suffer from very high fragmentation --
+any object of size PAGE_SIZE/2 or larger would occupy an entire page.
+This was one of the major issues with its predecessor (xvmalloc).
+
+To overcome these issues, zsmalloc allocates a bunch of 0-order pages
+and links them together using various 'struct page' fields. These linked
+pages act as a single higher-order page i.e. an object can span 0-order
+page boundaries. The code refers to these linked pages as a single entity
+called zspage.
+
+For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
+since this satisfies the requirements of all its current users (in the
+worst case, page is incompressible and is thus stored "as-is" i.e. in
+uncompressed form). For allocation requests larger than this size, failure
+is returned (see zs_malloc).
+
+Additionally, zs_malloc() does not return a dereferenceable pointer.
+Instead, it returns an opaque handle (unsigned long) which encodes actual
+location of the allocated object. The reason for this indirection is that
+zsmalloc does not keep zspages permanently mapped since that would cause
+issues on 32-bit systems where the VA region for kernel space mappings
+is very small. So, before using the allocated memory, the object has to
+be mapped using zs_map_object() to get a usable pointer and subsequently
+unmapped using zs_unmap_object().
+
+stat
+----
+
+With CONFIG_ZSMALLOC_STAT enabled, we can see zsmalloc internal information
+via /sys/kernel/debug/zsmalloc/<user name>. Here is a sample of stat output:
+
+# cat /sys/kernel/debug/zsmalloc/zram0/classes
+
+ class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage
+    ..
+    ..
+     9   176           0            1           186        129          8                4
+    10   192           1            0          2880       2872        135                3
+    11   208           0            1           819        795         42                2
+    12   224           0            1           219        159         12                4
+    ..
+    ..
+
+
+class: index of the size class
+size: size of the objects a zspage of this class stores
+almost_full: the number of ZS_ALMOST_FULL zspages (see below)
+almost_empty: the number of ZS_ALMOST_EMPTY zspages (see below)
+obj_allocated: the number of object slots in the zspages allocated for the class
+obj_used: the number of objects allocated to the user
+pages_used: the number of pages allocated for the class
+pages_per_zspage: the number of 0-order pages that make up a zspage
+
+We assign a zspage to the ZS_ALMOST_EMPTY fullness group when:
+      0 < n <= N / f, where
+n = number of allocated objects
+N = total number of objects a zspage can store
+f = fullness_threshold_frac (i.e., 4 at the moment)
+
+Similarly, we assign a zspage to:
+      ZS_ALMOST_FULL  when N / f < n < N
+      ZS_EMPTY        when n == 0
+      ZS_FULL         when n == N
diff --git a/MAINTAINERS b/MAINTAINERS
index b60d478770e9..560168c6530a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10945,6 +10945,7 @@ L:	linux-mm@xxxxxxxxx
 S:	Maintained
 F:	mm/zsmalloc.c
 F:	include/linux/zsmalloc.h
+F:	Documentation/vm/zsmalloc.txt
 
 ZSWAP COMPRESSED SWAP CACHING
 M:	Seth Jennings <sjennings@xxxxxxxxxxxxxx>
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 461243e14d3e..1833fc9e09cb 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -12,35 +12,6 @@
  */
 
 /*
- * This allocator is designed for use with zram. Thus, the allocator is
- * supposed to work well under low memory conditions. In particular, it
- * never attempts higher order page allocation which is very likely to
- * fail under memory pressure. On the other hand, if we just use single
- * (0-order) pages, it would suffer from very high fragmentation --
- * any object of size PAGE_SIZE/2 or larger would occupy an entire page.
- * This was one of the major issues with its predecessor (xvmalloc).
- *
- * To overcome these issues, zsmalloc allocates a bunch of 0-order pages
- * and links them together using various 'struct page' fields. These linked
- * pages act as a single higher-order page i.e. an object can span 0-order
- * page boundaries. The code refers to these linked pages as a single entity
- * called zspage.
- *
- * For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
- * since this satisfies the requirements of all its current users (in the
- * worst case, page is incompressible and is thus stored "as-is" i.e. in
- * uncompressed form). For allocation requests larger than this size, failure
- * is returned (see zs_malloc).
- *
- * Additionally, zs_malloc() does not return a dereferenceable pointer.
- * Instead, it returns an opaque handle (unsigned long) which encodes actual
- * location of the allocated object. The reason for this indirection is that
- * zsmalloc does not keep zspages permanently mapped since that would cause
- * issues on 32-bit systems where the VA region for kernel space mappings
- * is very small. So, before using the allocating memory, the object has to
- * be mapped using zs_map_object() to get a usable pointer and subsequently
- * unmapped using zs_unmap_object().
- *
  * Following is how we use various fields and flags of underlying
  * struct page(s) to form a zspage.
  *
-- 
1.9.1
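
For readers who want to check the stat columns and the fullness rules by
hand, here is a small userspace sketch. It is not the code in
mm/zsmalloc.c. SKETCH_PAGE_SIZE (4 KiB), the helper names
objs_per_zspage() and classify(), and the enum ordering are assumptions
made for illustration only; n, N and f follow the definitions in the
document above.

```c
#include <assert.h>

/* Assumption: 4 KiB pages. The kernel uses the architecture's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE	4096UL
/* f in the document: fullness_threshold_frac, 4 at the moment. */
#define SKETCH_THRESHOLD_FRAC	4UL

/* Illustrative ordering only; it need not match the kernel's enum. */
enum fullness_group {
	ZS_EMPTY,
	ZS_ALMOST_EMPTY,
	ZS_ALMOST_FULL,
	ZS_FULL,
};

/* N: how many objects of 'size' bytes fit into one zspage. */
static inline unsigned long objs_per_zspage(unsigned long pages_per_zspage,
					    unsigned long size)
{
	assert(size > 0);
	return pages_per_zspage * SKETCH_PAGE_SIZE / size;
}

/* Map n allocated objects out of N total onto a fullness group. */
static inline enum fullness_group classify(unsigned long n, unsigned long N)
{
	if (n == 0)
		return ZS_EMPTY;
	if (n == N)
		return ZS_FULL;
	if (n <= N / SKETCH_THRESHOLD_FRAC)
		return ZS_ALMOST_EMPTY;
	return ZS_ALMOST_FULL;
}
```

Under the 4 KiB assumption, the class 9 row of the sample output works
out as 4 * 4096 / 176 = 93 objects per zspage, so obj_allocated = 186
corresponds to 2 zspages and pages_used = 2 * 4 = 8. With N = 93 the
ZS_ALMOST_EMPTY bound is 93 / 4 = 23 (integer division), so a zspage
holding 23 objects is almost empty and one holding 24 is almost full.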

-- 
Kind regards,
Minchan Kim
