Re: [PATCH] block: add split_alignment for request queue

Ming Lei <ming.lei@xxxxxxxxxx> · Tue, 16 Jun 2020 10:40:28 +0800

On Mon, Jun 15, 2020 at 05:56:33PM -0700, Harshad Shirwadkar wrote:
> This feature allows the user to control the alignment at which the
> request queue is allowed to split bios. Google CloudSQL's 16k user-space
> application expects that direct IO writes aligned to a 16k boundary in
> user space are not split by the kernel at non-16k boundaries. More
> details about this feature can be found in CloudSQL's Cloud Next 2018
> presentation[1]. The underlying block device is capable of performing
> 16k-aligned writes atomically. Thus, this allows the user-space SQL
> application to avoid double-writes (to protect against partial
> failures), which are very costly, provided that these writes are not
> split at a non-16k boundary by any underlying layer.
> 
> We make use of Ext4's bigalloc feature to ensure that writes issued by
> Ext4 are 16k aligned. However, 16k-aligned data writes may get merged
> with contiguous non-16k-aligned Ext4 metadata writes. Such a write
> request would be broken up by the kernel, which guarantees only that
> the individual split requests are aligned to the physical block size.
> 
> We started observing a significant increase in 16k-unaligned splits in
> 5.4. Bisect points to commit 07173c3ec276cbb18dc0e0687d37d310e98a1480
> ("block: enable multipage bvecs"). That commit enables multipage bvecs,
> which results in multiple 16k-aligned writes issued by user space
> being merged into one big IO at first. Later, __blk_queue_split() splits
> these IOs while trying to align the individual split IOs to the
> physical block size.
> 
> The newly added split_alignment parameter is the alignment at which
> the request queue is allowed to split IO requests. By default this
> alignment is turned off and current behavior is unchanged.
> 

Such alignment can be achieved via q->limits.chunk_sectors; you
just need to expose it via sysfs and make it writable.
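
To illustrate the idea, here is a minimal userspace sketch (not kernel
code) of how a power-of-two chunk limit constrains where an IO may end.
The function name max_size_at_offset() and the sector_t typedef are
hypothetical and introduced only for this example; it assumes 512-byte
sectors, so a 16k chunk is 32 sectors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model: one sector is 512 bytes. */
typedef uint64_t sector_t;

/*
 * Largest number of sectors an IO starting at `offset` may cover
 * without crossing a chunk boundary, capped at `max_sectors`.
 * `chunk_sectors` must be a power of two; 0 means "no chunk limit".
 */
static unsigned int max_size_at_offset(sector_t offset,
                                       unsigned int chunk_sectors,
                                       unsigned int max_sectors)
{
    unsigned int to_boundary;

    if (chunk_sectors == 0)
        return max_sectors;

    /* The mask trick below is only valid for power-of-two chunks. */
    assert((chunk_sectors & (chunk_sectors - 1)) == 0);

    /* Sectors left before the next chunk boundary. */
    to_boundary = chunk_sectors -
                  (unsigned int)(offset & (chunk_sectors - 1));

    return to_boundary < max_sectors ? to_boundary : max_sectors;
}
```

In this model, with chunk_sectors set to 32, an IO starting at sector 8
is limited to 24 sectors, so it ends exactly on the next 16k boundary.
Note that each resulting piece is also at most one chunk long.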

Thanks,
Ming