Re: [PATCH 10/23] MM: submit multipage write for SWP_FS_OPS swap-space

On Mon, Jan 24, 2022 at 02:48:32PM +1100, NeilBrown wrote:
> swap_writepage() is given one page at a time, but may be called repeatedly
> in succession.
> For block-device swapspace, the blk_plug functionality allows the
> multiple pages to be combined together at lower layers.
> That cannot be used for SWP_FS_OPS as blk_plug may not exist - it is
> only active when CONFIG_BLOCK=y.  Consequently all swap reads over NFS
> are single page reads.
> 
> With this patch we pass a pointer-to-pointer via the wbc.
> swap_writepage can store state between calls - much like the pointer
> passed explicitly to swap_readpage.  After calling swap_writepage() some
> number of times, the state will be passed to swap_write_unplug() which
> can submit the combined request.
> 
> Signed-off-by: NeilBrown <neilb@xxxxxxx>
> ---
>  include/linux/writeback.h |    7 +++
>  mm/page_io.c              |  103 +++++++++++++++++++++++++++++----------------
>  mm/swap.h                 |    1 
>  mm/vmscan.c               |    9 +++-
>  4 files changed, 82 insertions(+), 38 deletions(-)
> 
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index fec248ab1fec..6dcaa0639c0d 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -80,6 +80,13 @@ struct writeback_control {
>  
>  	unsigned punt_to_cgroup:1;	/* cgrp punting, see __REQ_CGROUP_PUNT */
>  
> +	/* To enable batching of swap writes to non-block-device backends,
> +	 * "plug" can be set point to a 'struct swap_iocb *'.  When all swap
> +	 * writes have been submitted, if with swap_iocb is not NULL,
> +	 * swap_write_unplug() should be called.
> +	 */
> +	struct swap_iocb **plug;

Maybe plug isn't really the best name for something swap-specific in this
generic structure?

Also the above does not fit the normal kernel comment style with an
otherwise empty

	/*

line.
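
For illustration, the quoted comment reshaped into that style might look
roughly like the sketch below. The wrapper struct name is a hypothetical
stand-in so the fragment compiles on its own, and the comment wording is
only a guess at what was intended, not text from the patch.

```c
#include <stddef.h>

struct swap_iocb;	/* opaque forward declaration, enough for a pointer */

/* Hypothetical stand-in for the relevant part of struct writeback_control. */
struct writeback_control_sketch {
	/*
	 * To enable batching of swap writes to non-block-device backends,
	 * "plug" can be set to point to a 'struct swap_iocb *'.  When all
	 * swap writes have been submitted, if the swap_iocb is not NULL,
	 * swap_write_unplug() should be called.
	 */
	struct swap_iocb **plug;
};
```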

> +	for (p = 0; p < sio->pages; p++) {
> +		struct page *page = sio->bvec[p].bv_page;
> +
> +		if (ret != 0 && ret != PAGE_SIZE * sio->pages) {
> +			/*
> +			 * In the case of swap-over-nfs, this can be a
> +			 * temporary failure if the system has limited
> +			 * memory for allocating transmit buffers.
> +			 * Mark the page dirty and avoid
> +			 * folio_rotate_reclaimable but rate-limit the
> +			 * messages but do not flag PageError like
> +			 * the normal direct-to-bio case as it could
> +			 * be temporary.
> +			 */
> +			set_page_dirty(page);
> +			ClearPageReclaim(page);
> +			pr_err_ratelimited("Write error %ld on dio swapfile (%llu)\n",
> +					   ret, page_file_offset(page));
> +		} else
> +			count_vm_event(PSWPOUT);

I'd rather check for the error condition once and have separate loops
for the success vs error cases instead of checking the condition again
and again.
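
A rough, untested sketch of that shape, written as a userspace mock so it
can be compiled outside the kernel. Everything prefixed mock_ is a
hypothetical stand-in (for struct page, struct swap_iocb, the page flag
helpers and the PSWPOUT counter), and the 4096-byte PAGE_SIZE is an
assumption; only the error condition itself is taken from the quoted hunk.

```c
#include <stdio.h>

#define PAGE_SIZE 4096L		/* assumed page size for this mock */

struct mock_page {		/* hypothetical stand-in for struct page */
	int dirty;
	int reclaim;
};

struct mock_sio {		/* hypothetical stand-in for struct swap_iocb */
	int pages;
	struct mock_page *page[8];
};

static unsigned long mock_pswpout;	/* models count_vm_event(PSWPOUT) */

static void mock_write_end(struct mock_sio *sio, long ret)
{
	int p;

	/* Evaluate the error condition once, outside the loops. */
	if (ret != 0 && ret != PAGE_SIZE * sio->pages) {
		/*
		 * Error case: the failure may be temporary, so redirty
		 * each page and keep it out of reclaim rotation.
		 */
		for (p = 0; p < sio->pages; p++) {
			sio->page[p]->dirty = 1;	/* set_page_dirty() */
			sio->page[p]->reclaim = 0;	/* ClearPageReclaim() */
			/* models pr_err_ratelimited() in the real code */
			fprintf(stderr, "Write error %ld on dio swapfile\n", ret);
		}
	} else {
		/* Success case: only the accounting is left. */
		for (p = 0; p < sio->pages; p++)
			mock_pswpout++;
	}
}
```

The behaviour per page is unchanged from the quoted loop; the branch is
just taken once per completion instead of once per page.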

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@xxxxxx>



