Re: [PATCH] block: avoid blk_bio_segment_split for small I/O operations

Keith Busch <kbusch@xxxxxxxxxx> · Tue, 5 Nov 2019 08:04:21 +0900

On Mon, Nov 04, 2019 at 02:58:41PM -0800, Bart Van Assche wrote:
> On 11/4/19 2:50 PM, Keith Busch wrote:
> > On Mon, Nov 04, 2019 at 01:13:53PM -0700, Jens Axboe wrote:
> > > > If the device advertises a chunk boundary and this small IO happens to
> > > > cross it, skipping the split is going to harm performance.
> > > 
> > > Does anyone do that, that isn't the first gen intel weirdness? Honest question,
> > > but always seemed to me that this spec addition was driven entirely by that
> > > one device.
> > 
> > There are at least 3 generations of Intel DC P-series that use this,
> > maybe more. I'm not sure if any other available vendor devices report
> > this feature, though.
> > 
> > > And if they do, do they align on non-4k?
> > 
> > All existing ones I'm aware of are 128k, so 4k aligned, but if the LBA
> > format is 512B, you could start a 4k IO at a 126k offset to straddle the
> > boundary. Hm, maybe we don't care about the split penalty in that case
> > since unaligned access is already going to be slower for other reasons ...
> 
> Aren't NVMe devices expected to set the NOIOB parameter to avoid that NVMe
> commands straddle boundaries that incur a performance penalty? From the NVMe
> spec: "Namespace Optimal IO Boundary (NOIOB): This field indicates the
> optimal IO boundary for this namespace. This field is specified in logical
> blocks. The host should construct read and write commands that do not cross
> the IO boundary to achieve optimal performance. A value of 0h indicates that
> no optimal IO boundary is reported."

Yes, for nvme, noiob is the feature we're talking about.

I was initially just thinking about performance, but there are other
cases Christoph mentioned where the host split is necessary.
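
To make the straddling case from the thread concrete (a 128k boundary,
512B LBA format, and a 4k IO starting at a 126k offset), here is a minimal
user-space sketch of the boundary check. The helper names
noiob_to_sectors() and io_crosses_boundary() are hypothetical and are not
the kernel's functions; the sketch only assumes that the boundary is
expressed in 512-byte sectors and that NOIOB is reported in logical blocks,
as the spec text quoted above says.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert NOIOB (given in logical blocks) to
 * 512-byte sectors. lba_shift is log2 of the logical block size,
 * e.g. 9 for 512B and 12 for 4k. Assumes lba_shift >= 9.
 */
static uint32_t noiob_to_sectors(uint16_t noiob, unsigned int lba_shift)
{
	return (uint32_t)noiob << (lba_shift - 9);
}

/*
 * Hypothetical helper: returns true if an IO of nr_sectors starting at
 * start_sector crosses a multiple of chunk_sectors. A chunk_sectors of 0
 * means no boundary is reported, so nothing can cross it.
 */
static bool io_crosses_boundary(uint64_t start_sector, uint32_t nr_sectors,
				uint32_t chunk_sectors)
{
	if (chunk_sectors == 0 || nr_sectors == 0)
		return false;

	/* Offset within the current chunk plus the length must fit in it. */
	return (start_sector % chunk_sectors) + nr_sectors > chunk_sectors;
}
```

With a 512B LBA format, a NOIOB of 256 blocks is 256 sectors (128k). A 4k
IO (8 sectors) at a 126k offset (sector 252) gives 252 + 8 = 260 > 256, so
it crosses the boundary, while the same IO at a 124k offset (sector 248)
ends exactly on the boundary and does not.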