Re: [PATCH] btrfs: zlib: do not do unnecessary page copying for compression

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Qu,

I remember implementing btrfs zlib changes related to s390 dfltcc compression support a while ago:
https://lwn.net/Articles/808809/

The workspace buffer size was indeed enlarged for performance reasons.

Please see my comments below.

On 27.05.2024 11:24, Qu Wenruo wrote:
> [BUG]
> In function zlib_compress_folios(), we handle the input by:
> 
> - If there are multiple pages left
>   We copy the page content into workspace->buf, and use workspace->buf
>   as input for compression.
> 
>   But on x86_64 (which doesn't support dfltcc), that buffer size is just
>   one page, so we're wasting our CPU time copying the page for no
>   benefit.
> 
> - If there is only one page left
>   We use the mapped page address as input for compression.
> 
> The problem is, this means we will copy the whole input range except the
> last page (can be as large as 124K), without much obvious benefit.
> 
> Meanwhile the cost is pretty obvious.

Actually, the behavior for kernels w/o dfltcc support (currently available on s390
only) should not be affected. 
We copy input pages to the workspace->buf only if the buffer size is larger than 1 page.
At least it worked this way after my original btrfs zlib patch:
https://lwn.net/ml/linux-kernel/20200108105103.29028-1-zaslonko@xxxxxxxxxxxxx/

Has this behavior somehow changed after your page->folio conversion performed for btrfs? 
https://lore.kernel.org/all/cover.1706521511.git.wqu@xxxxxxxx/

Am I missing something?

> 
> [POSSIBLE REASON]
> The possible reason may be related to the support of S390 hardware zlib
> decompression acceleration.
> 
> As we allocate 4 pages (4 * 4K) as workspace input buffer just for s390.
> 
> [FIX]
> I checked the dfltcc code, there seems to be no requirement on the
> input buffer size.
> The function dfltcc_can_deflate() only checks:
> 
> - If the compression settings are supported
>   Only level/w_bits/strategy/level_mask is checked.
> 
> - If the hardware supports
> 
> No mention at all on the input buffer size, thus I believe there is no
> need to waste time doing the page copying.
> 
> Maybe the hardware acceleration is so good for s390 that they can offset
> the page copying cost, but it's definitely a penalty for non-s390
> systems.
> 
> So fix the problem by:
> 
> - Use the same buffer size
>   No matter if dfltcc support is enabled or not
> 
> - Always use page address as input
> 
> Cc: linux-s390@xxxxxxxxxxxxxxx
> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
> ---
>  fs/btrfs/zlib.c | 67 +++++++++++--------------------------------------
>  1 file changed, 14 insertions(+), 53 deletions(-)
> 
> diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
> index d9e5c88a0f85..9c88a841a060 100644
> --- a/fs/btrfs/zlib.c
> +++ b/fs/btrfs/zlib.c
> @@ -65,21 +65,8 @@ struct list_head *zlib_alloc_workspace(unsigned int level)
>  			zlib_inflate_workspacesize());
>  	workspace->strm.workspace = kvzalloc(workspacesize, GFP_KERNEL | __GFP_NOWARN);
>  	workspace->level = level;
> -	workspace->buf = NULL;
> -	/*
> -	 * In case of s390 zlib hardware support, allocate lager workspace
> -	 * buffer. If allocator fails, fall back to a single page buffer.
> -	 */
> -	if (zlib_deflate_dfltcc_enabled()) {
> -		workspace->buf = kmalloc(ZLIB_DFLTCC_BUF_SIZE,
> -					 __GFP_NOMEMALLOC | __GFP_NORETRY |
> -					 __GFP_NOWARN | GFP_NOIO);
> -		workspace->buf_size = ZLIB_DFLTCC_BUF_SIZE;
> -	}
> -	if (!workspace->buf) {
> -		workspace->buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
> -		workspace->buf_size = PAGE_SIZE;
> -	}
> +	workspace->buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
> +	workspace->buf_size = PAGE_SIZE;
>  	if (!workspace->strm.workspace || !workspace->buf)
>  		goto fail;
>  
> @@ -103,7 +90,6 @@ int zlib_compress_folios(struct list_head *ws, struct address_space *mapping,
>  	struct folio *in_folio = NULL;
>  	struct folio *out_folio = NULL;
>  	unsigned long bytes_left;
> -	unsigned int in_buf_folios;
>  	unsigned long len = *total_out;
>  	unsigned long nr_dest_folios = *out_folios;
>  	const unsigned long max_out = nr_dest_folios * PAGE_SIZE;
> @@ -130,7 +116,6 @@ int zlib_compress_folios(struct list_head *ws, struct address_space *mapping,
>  	folios[0] = out_folio;
>  	nr_folios = 1;
>  
> -	workspace->strm.next_in = workspace->buf;
>  	workspace->strm.avail_in = 0;
>  	workspace->strm.next_out = cfolio_out;
>  	workspace->strm.avail_out = PAGE_SIZE;
> @@ -142,43 +127,19 @@ int zlib_compress_folios(struct list_head *ws, struct address_space *mapping,
>  		 */
>  		if (workspace->strm.avail_in == 0) {
>  			bytes_left = len - workspace->strm.total_in;
> -			in_buf_folios = min(DIV_ROUND_UP(bytes_left, PAGE_SIZE),
> -					    workspace->buf_size / PAGE_SIZE);

	doesn't this always set *in_buf_folios* to 1 in case no dfltcc support (single page workspace buffer)?

> -			if (in_buf_folios > 1) {
> -				int i;
> -
> -				for (i = 0; i < in_buf_folios; i++) {
> -					if (data_in) {
> -						kunmap_local(data_in);
> -						folio_put(in_folio);
> -						data_in = NULL;
> -					}
> -					ret = btrfs_compress_filemap_get_folio(mapping,
> -							start, &in_folio);
> -					if (ret < 0)
> -						goto out;
> -					data_in = kmap_local_folio(in_folio, 0);
> -					copy_page(workspace->buf + i * PAGE_SIZE,
> -						  data_in);
> -					start += PAGE_SIZE;
> -				}
> -				workspace->strm.next_in = workspace->buf;
> -			} else {
> -				if (data_in) {
> -					kunmap_local(data_in);
> -					folio_put(in_folio);
> -					data_in = NULL;
> -				}
> -				ret = btrfs_compress_filemap_get_folio(mapping,
> -						start, &in_folio);
> -				if (ret < 0)
> -					goto out;
> -				data_in = kmap_local_folio(in_folio, 0);
> -				start += PAGE_SIZE;
> -				workspace->strm.next_in = data_in;
> +			if (data_in) {
> +				kunmap_local(data_in);
> +				folio_put(in_folio);
> +				data_in = NULL;
>  			}
> -			workspace->strm.avail_in = min(bytes_left,
> -						       (unsigned long) workspace->buf_size);
> +			ret = btrfs_compress_filemap_get_folio(mapping,
> +					start, &in_folio);
> +			if (ret < 0)
> +				goto out;
> +			data_in = kmap_local_folio(in_folio, 0);
> +			start += PAGE_SIZE;
> +			workspace->strm.next_in = data_in;
> +			workspace->strm.avail_in = min(bytes_left, PAGE_SIZE);
>  		}
>  
>  		ret = zlib_deflate(&workspace->strm, Z_SYNC_FLUSH);

Thanks,
Mikhail




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Kernel Development]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Info]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Linux Media]     [Device Mapper]

  Powered by Linux