The patch titled
     Subject: lib/lzo: tidy-up ifdefs
has been removed from the -mm tree.  Its filename was
     lib-lzo-tidy-up-ifdefs.patch

This patch was dropped because an updated version will be merged

------------------------------------------------------
From: Dave Rodgman <dave.rodgman@xxxxxxx>
Subject: lib/lzo: tidy-up ifdefs

Patch series "lib/lzo: performance improvements", v4.

This series introduces performance improvements for lzo.

The previous version of this patchset is here:
https://lkml.org/lkml/2018/11/21/625

This version tidies up the ifdefs as per Christoph's comment (although
certainly more could be done, this is at least a bit more consistent
with normal kernel coding style).

On 23/11/2018 2:12 am, Sergey Senozhatsky wrote:
>> The graph below shows the weighted round-trip throughput of lzo, lz4
>> and lzo-rle, for randomly generated 4k chunks of data with varying
>> levels of entropy.  (To calculate weighted round-trip throughput,
>> compression performance is emphasised to reflect the fact that zram
>> does around 2.25x more compression than decompression.)
>
> Right. The number is data dependent. Not all swapped out pages can be
> compressed; compressed pages that end up being >= zs_huge_class_size()
> are considered incompressible and stored as is.
>
> I'd say that on my setups around 50-60% of pages are incompressible.

So, just to give a bit more detail: the test setup was a Samsung
Chromebook Pro, cycling through 80 tabs in Chrome.  With lzo-rle, only
5% of pages increased in size, and 90% of pages compressed to 75% of
their original size or better.  The mean compression ratio was 41%.
Importantly for lzo-rle, there are a lot of low-entropy pages where it
can do well: in total, about 20% of the data is zeros forming part of a
run of 4 or more bytes.

As a quick summary of the impact of these patches on bigger chunks of
data, I've compared the performance of four different variants of lzo
on two large (~40 MB) files.  The numbers show round-trip throughput
in MB/s:

  Variant         | Low-entropy | High-entropy
  Current lzo     |     242     |     157
  Arm opts        |     290     |     159
  RLE             |     876     |     151
  Arm opts + RLE  |    1150     |     181

So both the Arm optimisations (8,16-byte copy & CTZ patches) and the
RLE implementation make a significant contribution to the overall
performance uplift.
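To make the low-entropy case concrete, here is a rough userspace sketch
of the zero-run idea (a toy token format invented purely for
illustration -- toy_rle_encode and its token layout are not the actual
lzo-rle bitstream, which is defined by the RLE patch itself):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Toy run-length encoder.  Each token starts with a control byte C:
 * if the top bit of C is set, the token stands for (C & 0x7f) zero
 * bytes and has no payload; otherwise C literal bytes follow verbatim.
 */
static size_t toy_rle_encode(const uint8_t *in, size_t in_len, uint8_t *out)
{
	size_t ip = 0, op = 0;

	while (ip < in_len) {
		size_t run = 0, lit = 0;

		/* Count zero bytes at ip, capped at 127 per token. */
		while (ip + run < in_len && in[ip + run] == 0 && run < 127)
			run++;

		if (run >= 4) {
			out[op++] = 0x80 | (uint8_t)run;  /* zero-run token */
			ip += run;
			continue;
		}

		/* Gather literals until the next run of 4 or more zeros. */
		while (ip + lit < in_len && lit < 127) {
			if (in[ip + lit] == 0 && ip + lit + 3 < in_len &&
			    in[ip + lit + 1] == 0 && in[ip + lit + 2] == 0 &&
			    in[ip + lit + 3] == 0)
				break;
			lit++;
		}
		out[op++] = (uint8_t)lit;                 /* literal token */
		memcpy(out + op, in + ip, lit);
		op += lit;
		ip += lit;
	}
	return op;
}

int main(void)
{
	/* A "low-entropy page": mostly zeros with a few non-zero islands. */
	static uint8_t page[4096], out[2 * 4096];
	size_t i, n;

	for (i = 0; i < sizeof(page); i += 256)
		memset(page + i, 0xab, 16);

	n = toy_rle_encode(page, sizeof(page), out);
	printf("%zu -> %zu bytes\n", sizeof(page), n);
	return 0;
}

Each long zero run collapses to a single control byte here, which is
the same effect the low-entropy numbers above are measuring for
lzo-rle.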
This patch (of 8):

Modify the ifdefs in lzodefs.h to be more consistent with normal kernel
macros (e.g., change __aarch64__ to CONFIG_ARM64).

Link: http://lkml.kernel.org/r/20181127161913.23863-2-dave.rodgman@xxxxxxx
Signed-off-by: Dave Rodgman <dave.rodgman@xxxxxxx>
Cc: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Cc: David S. Miller <davem@xxxxxxxxxxxxx>
Cc: Nitin Gupta <nitingupta910@xxxxxxxxx>
Cc: Richard Purdie <rpurdie@xxxxxxxxxxxxxx>
Cc: Markus F.X.J. Oberhumer <markus@xxxxxxxxxxxxx>
Cc: Minchan Kim <minchan@xxxxxxxxxx>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@xxxxxxxxx>
Cc: Sonny Rao <sonnyrao@xxxxxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Matt Sealey <matt.sealey@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 lib/lzo/lzodefs.h |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/lib/lzo/lzodefs.h~lib-lzo-tidy-up-ifdefs
+++ a/lib/lzo/lzodefs.h
@@ -15,7 +15,7 @@
 #define COPY4(dst, src)	\
 		put_unaligned(get_unaligned((const u32 *)(src)), (u32 *)(dst))
-#if defined(__x86_64__)
+#if defined(CONFIG_X86_64)
 #define COPY8(dst, src)	\
 		put_unaligned(get_unaligned((const u64 *)(src)), (u64 *)(dst))
 #else
@@ -25,12 +25,12 @@
 #if defined(__BIG_ENDIAN) && defined(__LITTLE_ENDIAN)
 #error "conflicting endian definitions"
-#elif defined(__x86_64__)
+#elif defined(CONFIG_X86_64)
 #define LZO_USE_CTZ64	1
 #define LZO_USE_CTZ32	1
-#elif defined(__i386__) || defined(__powerpc__)
+#elif defined(CONFIG_X86) || defined(CONFIG_PPC)
 #define LZO_USE_CTZ32	1
-#elif defined(__arm__) && (__LINUX_ARM_ARCH__ >= 5)
+#elif defined(CONFIG_ARM) && (__LINUX_ARM_ARCH__ >= 5)
 #define LZO_USE_CTZ32	1
 #endif
_

Patches currently in -mm which might be from dave.rodgman@xxxxxxx are

lib-lzo-implement-run-length-encoding.patch
lib-lzo-implement-run-length-encoding-v4.patch
lib-lzo-separate-lzo-rle-from-lzo.patch
lib-lzo-separate-lzo-rle-from-lzo-v4.patch
zram-default-to-lzo-rle-instead-of-lzo.patch
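As background on the LZO_USE_CTZ32/LZO_USE_CTZ64 flags whose guards are
changed above: they gate a count-trailing-zeros fast path for finding
the first byte at which two sequences differ.  A rough standalone
sketch of the idea, assuming a little-endian machine and the GCC/Clang
__builtin_ctz builtin (the match_length helper is made up for
illustration; this is not the kernel's actual match-finding code):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Compare 4 bytes at a time; on a mismatch, count the trailing zero
 * bits of the XOR of the two words to locate the first differing byte.
 * On a little-endian CPU, bits 0..7 of the XOR come from the first
 * byte, so ctz/8 gives the byte offset of the first difference.
 */
static size_t match_length(const uint8_t *a, const uint8_t *b, size_t max)
{
	size_t len = 0;

	while (len + 4 <= max) {
		uint32_t wa, wb;

		memcpy(&wa, a + len, 4);	/* unaligned-safe loads */
		memcpy(&wb, b + len, 4);
		if (wa != wb)
			return len + (__builtin_ctz(wa ^ wb) >> 3);
		len += 4;
	}
	while (len < max && a[len] == b[len])	/* byte-at-a-time tail */
		len++;
	return len;
}

int main(void)
{
	const char *x = "abcdefgh";
	const char *y = "abcdeXgh";

	printf("match length: %zu\n",
	       match_length((const uint8_t *)x, (const uint8_t *)y, 8));
	return 0;
}

LZO_USE_CTZ64 is the same idea with 64-bit loads and __builtin_ctzll,
which is why it is only enabled for the 64-bit target in the hunk above.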