[PATCH 0/4] zswap: Optimize compressed pool memory utilization

On 17 August 2016 at 18:08, Pekka Enberg <penberg@xxxxxxxxxx> wrote:
> On Wed, Aug 17, 2016 at 1:03 PM, Srividya Desireddy
> <srividya.dr@xxxxxxxxxxx> wrote:
>> This series of patches optimizes the memory utilized by zswap for storing
>> the swapped out pages.
>>
>> Zswap is a cache which compresses the pages that are being swapped out
>> and stores them into a dynamically allocated RAM-based memory pool.
>> Experiments have shown that around 10-15% of pages stored in zswap are
>> duplicates which results in 10-12% more RAM required to store these
>> duplicate compressed pages. Around 10-20% of pages stored in zswap
>> are zero-filled pages, but these pages are handled as normal pages by
>> compressing and allocating memory in the pool.
>>
>> The following patch-set optimizes memory utilized by zswap by avoiding the
>> storage of duplicate pages and zero-filled pages in zswap compressed memory
>> pool.
>>
>> Patch 1/4: zswap: Share zpool memory of duplicate pages
>> This patch shares compressed pool memory of the duplicate pages. When a new
>> page is requested for swap-out to zswap, it searches for an identical page
>> among the pages already stored in zswap. If an identical page is found, the
>> compressed page data of the identical page is shared with the new page. This
>> avoids allocation of memory in the compressed pool for a duplicate page.
>> This feature is tested on devices with 1GB, 2GB and 3GB RAM by executing
>> performance test at low memory conditions. Around 15-20% of the pages
>> swapped are duplicate of the pages existing in zswap, resulting in 15%
>> saving of zswap memory pool when compared to the baseline version.
>>
>> Test Parameters         Baseline    With patch  Improvement
>> Total RAM               955MB       955MB
>> Available RAM           254MB       269MB       15MB
>> Avg. App entry time     2.469sec    2.207sec    7%
>> Avg. App close time     1.151sec    1.085sec    6%
>> Apps launched in 1sec   5           12          7
>>
>> There is a small overhead in the zswap store function due to the search
>> operation for finding duplicate pages. However, if a duplicate page is
>> found, it saves the compression and allocation time of the page. The average
>> overhead per zswap_frontswap_store() function call in the experimental
>> device is 9us. There is no overhead in case of zswap_frontswap_load()
>> operation.
>>
>> Patch 2/4: zswap: Enable/disable sharing of duplicate pages at runtime
>> This patch adds a module parameter to enable or disable the sharing of
>> duplicate zswap pages at runtime.
>>
>> Patch 3/4: zswap: Zero-filled pages handling
>> This patch checks if a page to be stored in zswap is a zero-filled page
>> (i.e. contents of the page are all zeros). If such a page is found,
>> compression and allocation of memory for the compressed page is avoided
>> and instead the page is just marked as a zero-filled page.
>> Although the compressed size of a zero-filled page using the LZO compressor
>> is very small (52 bytes including zswap_header), this patch saves compression
>> and allocation time during store operation and decompression time during
>> zswap load operation for zero-filled pages. Experiments have shown that
>> around 10-20% of pages stored in zswap are zero-filled.
>
> Aren't zero-filled pages already handled by patch 1/4 as their
> contents match? So the overall memory saving is 52 bytes?
>
> - Pekka

Thanks for the quick reply.

Zero-filled pages can also be handled by patch 1/4, which searches the
pages already stored in zswap for a duplicate. However, it has been
observed that the average search time to identify duplicate zero-filled
pages (using patch 1/4) is almost three times that of checking all pages
for zero-fill.
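
To illustrate why the direct check is cheap, here is a minimal userspace
sketch of a zero-filled test. It is not the code from patch 3/4: the name
sketch_page_is_zero_filled() and the fixed 4096-byte SKETCH_PAGE_SIZE are
assumptions made for this example, and the buffer is assumed to be
word-aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this sketch; kernel code would use PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Hypothetical helper, not taken from the patch: returns 1 if every word
 * of the page is zero, 0 otherwise. The page must be word-aligned.
 */
int sketch_page_is_zero_filled(const void *page)
{
    const unsigned long *word = page;
    size_t i;

    assert(page != NULL);
    for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(*word); i++) {
        if (word[i] != 0)
            return 0;   /* stop at the first non-zero word */
    }
    return 1;
}
```

The scan is a single linear pass that exits at the first non-zero word, so
a page with data near its start is rejected after very few comparisons.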

Also, with patch 1/4, zswap_frontswap_load() has to decompress the
compressed zero-filled page. With patch 3/4, zswap_frontswap_load() just
fills the page with zeros when loading a zero-filled page, which is faster
than decompression.
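
The load-side difference can be sketched in userspace as follows. This is
only an illustration under stated assumptions: struct sketch_entry, its
zero_filled flag and sketch_load() are hypothetical names, not the
structures used in the patch, and a plain memcpy() stands in for the real
decompression step.

```c
#include <stddef.h>
#include <string.h>

/* Assumed page size for this sketch; kernel code would use PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/* Hypothetical, simplified stand-in for a stored zswap entry. */
struct sketch_entry {
    int zero_filled;            /* set at store time if the page was all zeros */
    const unsigned char *data;  /* stored bytes; unused when zero_filled is set */
    size_t length;              /* number of stored bytes */
};

/* Returns 0 on success, -1 if the entry cannot be loaded. */
int sketch_load(const struct sketch_entry *entry, unsigned char *page)
{
    if (entry->zero_filled) {
        /* Marked entry: no stored data to read, just clear the page. */
        memset(page, 0, SKETCH_PAGE_SIZE);
        return 0;
    }
    if (entry->data == NULL || entry->length > SKETCH_PAGE_SIZE)
        return -1;
    /* Placeholder for decompression: a real load would decompress here. */
    memcpy(page, entry->data, entry->length);
    return 0;
}
```

For a marked entry the load reduces to one memset() of the destination
page and never touches the compressed pool.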

- Srividya
