Avi Kivity wrote:
Well, no one is talking about 64KB granularity for in-core files. As
you noticed, Windows uses the MMU page size. We could keep doing that
and still have 16KB+ sector sizes. It just means an RMW if you don't
happen to have the adjoining clean pages in cache.
Sure, on a rotating disk that's a disaster, but we're talking SSD here,
so while you're doubling your access time, you're doubling a fairly
small quantity. The controller would do the same if it exposed smaller
sectors, so there's no huge loss.
We still lose on disk storage efficiency, but I'm guessing that for a
modern tree with some object files with debug information and a .git
directory, it won't be such a great hit. For more mainstream uses, it
would be negligible.
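To put a rough number on the storage-efficiency point, here is a minimal
sketch that rounds each file up to whole blocks and compares the slack at
two block sizes. The file sizes are made up for illustration, and the
function names are hypothetical; real filesystems may pack small files or
tails differently.

```python
def allocated_bytes(file_size, block_size):
    """Bytes consumed on disk when a file is rounded up to whole blocks."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("sizes must be non-negative, block size positive")
    # Integer ceiling division, then scale back up to bytes.
    return -(-file_size // block_size) * block_size


def slack_fraction(file_sizes, block_size):
    """Wasted space as a fraction of the real data size."""
    used = sum(file_sizes)
    allocated = sum(allocated_bytes(s, block_size) for s in file_sizes)
    return (allocated - used) / used


if __name__ == "__main__":
    # Hypothetical mix: a few small sources, one object file, one pack file.
    sizes = [300, 2_000, 5_000, 40_000, 1_500_000]
    for bs in (4096, 16384):
        print(f"{bs:>6} byte blocks: {slack_fraction(sizes, bs):.2%} slack")
```

The slack is dominated by the count of small files, not by total data
size, which is why a tree with large object and pack files is hit less
than one made mostly of tiny files.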
Speaking of RMW... in one sense, we have to deal with RMW anyway.
Upcoming ATA hard drives will be configured with a normal 512-byte
sector interface, but the underlying physical sector size is 1k or 4k.
The disk performs the RMW for us, but we must be aware of physical
sector size in order to determine proper alignment of on-disk data, to
minimize RMW cycles.
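As a rough illustration of why alignment matters, the sketch below counts
how many physical sectors a write only partially covers, since each of
those forces the drive into an RMW. It assumes a 512-byte logical sector
and a physical size that is a multiple of it; the function names are
hypothetical and this is not how any particular driver computes it.

```python
LOGICAL_SECTOR = 512  # bytes per sector as seen through the drive interface


def is_aligned(start_lba, physical_size):
    """True if the byte offset of start_lba falls on a physical boundary."""
    return (start_lba * LOGICAL_SECTOR) % physical_size == 0


def rmw_cycles(start_lba, num_sectors, physical_size):
    """Count physical sectors that a write covers only partially."""
    if num_sectors <= 0:
        return 0
    start = start_lba * LOGICAL_SECTOR
    end = start + num_sectors * LOGICAL_SECTOR  # exclusive byte offset
    head_partial = start % physical_size != 0
    tail_partial = end % physical_size != 0
    first = start // physical_size
    last = (end - 1) // physical_size
    if first == last:
        # The whole write lands inside a single physical sector.
        return 1 if (head_partial or tail_partial) else 0
    return int(head_partial) + int(tail_partial)


if __name__ == "__main__":
    # A 4 KiB write (8 logical sectors) at the legacy DOS offset of LBA 63
    # versus the same write at LBA 64, on a 4k physical sector drive.
    for lba in (63, 64):
        print(lba, is_aligned(lba, 4096), rmw_cycles(lba, 8, 4096))
```

A filesystem issuing 4k writes on a partition that starts at LBA 63 would
straddle two physical sectors on every write, while the same partition
moved to LBA 64 would avoid RMW entirely.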
At the moment, it seems like most of the effort to get these ATA devices
to perform efficiently is in getting partition / RAID stripe offsets set
up properly.
So perhaps for NVMHCI we could
(a) hardcode NVM sector size maximum at 4k
(b) do RMW in the driver for sector size >4k, and
(c) export information indicating the true sector size, in a manner
similar to how the ATA driver passes that info to userland partitioning
tools.
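Points (b) and (c) could look something like the following toy model. It
is only a sketch under stated assumptions: BigSectorDevice stands in for
hardware that transfers whole native sectors, RmwShim plays the role of
the driver exposing 4k sectors, and true_sector_size is a made-up name
for whatever would be exported to partitioning tools. None of these are
real kernel or NVMHCI interfaces.

```python
class BigSectorDevice:
    """Hypothetical device that only transfers whole native sectors."""

    def __init__(self, native_size, num_sectors):
        self.native_size = native_size
        self._data = bytearray(native_size * num_sectors)
        self.reads = 0
        self.writes = 0

    def read_sector(self, index):
        self.reads += 1
        off = index * self.native_size
        return bytes(self._data[off:off + self.native_size])

    def write_sector(self, index, buf):
        if len(buf) != self.native_size:
            raise ValueError("must write exactly one native sector")
        self.writes += 1
        off = index * self.native_size
        self._data[off:off + self.native_size] = buf


class RmwShim:
    """Exposes 4k sectors on top of a device with larger native sectors."""

    EXPOSED_SIZE = 4096

    def __init__(self, dev):
        if dev.native_size % self.EXPOSED_SIZE != 0:
            raise ValueError("native size must be a multiple of 4k")
        self.dev = dev
        self.ratio = dev.native_size // self.EXPOSED_SIZE

    @property
    def true_sector_size(self):
        # (c): what alignment-aware tools would want to know.
        return self.dev.native_size

    def read(self, lba):
        native, slot = divmod(lba, self.ratio)
        buf = self.dev.read_sector(native)
        off = slot * self.EXPOSED_SIZE
        return buf[off:off + self.EXPOSED_SIZE]

    def write(self, lba, data):
        if len(data) != self.EXPOSED_SIZE:
            raise ValueError("must write exactly one 4k sector")
        native, slot = divmod(lba, self.ratio)
        if self.ratio == 1:
            # Native sector is already 4k, so no RMW is needed.
            self.dev.write_sector(native, bytes(data))
            return
        # (b): read the native sector, patch the 4k slot, write it back.
        buf = bytearray(self.dev.read_sector(native))
        off = slot * self.EXPOSED_SIZE
        buf[off:off + self.EXPOSED_SIZE] = data
        self.dev.write_sector(native, bytes(buf))


if __name__ == "__main__":
    dev = BigSectorDevice(native_size=16384, num_sectors=4)
    shim = RmwShim(dev)
    shim.write(5, b"\xab" * 4096)
    print(shim.true_sector_size, dev.reads, dev.writes)
```

The sketch ignores caching and atomicity: a real driver would have to
deal with power loss between the read and the write-back, and would want
to merge adjacent 4k writes so that full native sectors skip the read.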
Jeff