Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations

On Tue, Feb 28, 2023 at 10:52:15PM -0500, Theodore Ts'o wrote:
> Emulated block devices offered by cloud VM’s can provide functionality
> to guest kernels and applications that traditionally have not been
> available to users of consumer-grade HDD and SSD’s.  For example,
> today it’s possible to create a block device in Google’s Persistent
> Disk with a 16k physical sector size, which promises that aligned 16k
> writes will be atomic.  With NVMe, it is possible for a storage
> device to promise this without requiring read-modify-write updates for
> sub-16k writes. 

I'm not sure it does. The NVMe spec doesn't say AWUN writes are never an RMW
operation. It suggests that aligning to NPWA is the best way to avoid RMW, but
it doesn't guarantee that, nor does it require that this limit align with atomic
boundaries. NVMe provides a lot of hints, but stops short of promises. Vendors
can promise whatever they want, but that's outside the spec.

> All that is necessary are some changes in the block
> layer so that the kernel does not inadvertently tear a write request
> when splitting a bio because it is too large (perhaps because it got
> merged with some other request, and then it gets split at an
> inconvenient boundary).

All the limits needed to optimally split on physical boundaries exist, so I
hope we're using them correctly via get_max_io_size().
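
To make the splitting concern concrete, here is a minimal user-space sketch of
the idea: cap the first fragment so that it ends on a physical block boundary
whenever the size limit allows it. This is not the kernel's code;
first_fragment_sectors() and its parameters are hypothetical names chosen for
illustration, and it assumes all values are in 512-byte sectors with a
power-of-two physical block size.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API). Given the starting sector of an
 * I/O, the maximum I/O size and the physical block size (all in 512-byte
 * sectors, pbs assumed to be a power of two), return how many sectors the
 * first fragment may cover so that it ends on a physical block boundary.
 */
static uint32_t first_fragment_sectors(uint64_t start, uint32_t max_sectors,
                                       uint32_t pbs)
{
    /* Offset of the starting sector inside its physical block. */
    uint32_t offset = (uint32_t)(start & (pbs - 1));
    /* Furthest allowed end, rounded down to a physical block boundary. */
    uint32_t end = (offset + max_sectors) & ~(pbs - 1);

    if (end > offset)
        return end - offset;

    /* The limit is too small to reach a boundary; use it unchanged. */
    return max_sectors;
}
```

With a 16k physical block (32 sectors), a request starting at sector 8 and a
1280-sector limit, the first fragment is trimmed to 1272 sectors, so the next
fragment starts at sector 1280, which is 16k-aligned.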

That said, I was hoping you were going to suggest supporting 16k logical block
sizes. Not a problem on some arches, but still problematic when PAGE_SIZE is
4k. :)



