On Mon, Jun 15, 2020 at 05:56:33PM -0700, Harshad Shirwadkar wrote:
> This feature allows the user to control the alignment at which request
> queue is allowed to split bios. Google CloudSQL's 16k user space
> application expects that direct io writes aligned at 16k boundary in
> the user-space are not split by kernel at non-16k boundaries. More
> details about this feature can be found in CloudSQL's Cloud Next 2018
> presentation[1]. The underlying block device is capable of performing
> 16k aligned writes atomically. Thus, this allows the user-space SQL
> application to avoid double-writes (to protect against partial
> failures) which are very costly provided that these writes are not
> split at non-16k boundary by any underlying layers.
>
> We make use of Ext4's bigalloc feature to ensure that writes issued by
> Ext4 are 16k aligned. But, 16K aligned data writes may get merged with
> contiguous non-16k aligned Ext4 metadata writes. Such a write request
> would be broken by the kernel only guaranteeing that the individually
> split requests are physical block size aligned.
>
> We started observing a significant increase in 16k unaligned splits in
> 5.4. Bisect points to commit 07173c3ec276cbb18dc0e0687d37d310e98a1480
> ("block: enable multipage bvecs"). This patch enables multipage bvecs
> resulting in multiple 16k aligned writes issued by the user-space to
> be merged into one big IO at first. Later, __blk_queue_split() splits
> these IOs while trying to align individual split IOs to be physical
> block size.
>
> Newly added split_alignment parameter is the alignment at which
> request queue is allowed to split IO request. By default this
> alignment is turned off and current behavior is unchanged.
>

Such alignment can be reached via q->limits.chunk_sectors, and you just
need to expose it via sysfs and make it writable.

Thanks,
Ming
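
To illustrate why a chunk_sectors limit gives the alignment being asked
for, here is a minimal userspace sketch, not kernel code. It assumes
512-byte sectors and a power-of-two chunk size, and the helper names
(max_sectors_at, count_splits) are hypothetical, chosen only for this
example. It models the idea that an I/O starting at a given sector may
extend at most to the next chunk boundary, so no split piece ever
crosses one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: the largest number of sectors an I/O starting at
 * 'offset' may cover without crossing a chunk boundary.
 * chunk_sectors == 0 means "no chunk limit"; otherwise this sketch
 * assumes it is a power of two.
 */
static uint32_t max_sectors_at(uint64_t offset, uint32_t chunk_sectors,
			       uint32_t max_sectors)
{
	uint32_t to_boundary;

	if (!chunk_sectors)
		return max_sectors;

	/* Assumption of this sketch: power-of-two chunk size. */
	assert((chunk_sectors & (chunk_sectors - 1)) == 0);

	to_boundary = chunk_sectors - (uint32_t)(offset & (chunk_sectors - 1));
	return to_boundary < max_sectors ? to_boundary : max_sectors;
}

/*
 * Hypothetical model: number of pieces an I/O of 'len' sectors starting
 * at 'start' is cut into when every piece obeys max_sectors_at().
 * max_sectors must be non-zero.
 */
static unsigned int count_splits(uint64_t start, uint32_t len,
				 uint32_t chunk_sectors, uint32_t max_sectors)
{
	unsigned int pieces = 0;

	assert(max_sectors > 0);
	while (len) {
		uint32_t n = max_sectors_at(start, chunk_sectors, max_sectors);

		if (n > len)
			n = len;
		start += n;
		len -= n;
		pieces++;
	}
	return pieces;
}
```

With a 16k chunk (32 sectors of 512 bytes), a merged 64k write that
starts on a 16k boundary is cut into four 16k pieces in this model,
while a write that starts 8 sectors into a chunk is first trimmed to
the next boundary and only then proceeds in whole chunks.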