On Wed, Sep 25, 2024 at 4:42 PM Max Gurtovoy <mgurtovoy@xxxxxxxxxx> wrote: > > Hi Daniel, > > On 25/09/2024 23:57, Daniel Verkamp wrote: >> >> On Wed, Sep 25, 2024 at 8:05 AM Max Gurtovoy <mgurtovoy@xxxxxxxxxx> wrote: >> >> This field is only valid when the VIRTIO_BLK_F_BLK_SIZE feature bit is >> set. >> >> The blk_size field represents the smallest addressable unit of data that >> can be read from or written to the device. It is always a power of two >> and typically ranges from 512 bytes to larger values such as 4 KB. >> >> I agree that this is what the blk_size field probably *should* have >> meant (matching other storage protocols like ATA, SCSI, NVMe, ...), >> but I believe rewriting the spec like this changes the meaning of the >> blk_size field in an incompatible way. > > I believe that this is more of a clarification than a re-write. > >> Previously, blk_size was described as the "optimal" sector size, so it >> seems clear that the driver is still allowed to send requests that are >> not aligned to blk_size ("awareness of the correct value can affect >> performance" but no mention of correctness). A device may need to do >> extra work to support this, of course, such as turning an unaligned >> write into a read-modify-write sequence if the underlying storage >> device doesn't support such an access directly, but such requests are >> still valid. It's also perfectly fine for a driver to ignore >> device-offered features like VIRTIO_BLK_F_BLK_SIZE and still operate >> correctly. > > A device should report its block size (logical) using the VIRTIO_BLK_F_BLK_SIZE feature. For legacy devices that don't report this feature, Linux assumes a 512-byte block size. Sure, I agree with both points here. > Drivers should not ignore this feature when it's offered by the device. It's crucial for correct operation IMO. This is a "should", not a "must", though - adding a new MUST is a breaking change. > The statement "driver is still allowed to send requests that are not aligned to blk_size" needs explicit clarification in the spec. It definitely would be better to have it written out explicitly, one way or the other, no disagreement there. > From my understanding of the spec, the driver is not allowed to do so (otherwise error will be returned or the behavior is undefined). This doesn't seem to follow from my reading of the spec. Otherwise, phrases like "optimal sector size" and "can affect performance" would not make sense (such a request would instead be described as illegal/returning an error). > In Linux, the reported blk_size is used to set logical_block_size, while physical_block_exp is used to calculate the physical_block_size. Sure, but this is describing the behavior of one driver implementation, not really relevant to the meaning of the spec. > Regarding the specification statement: "This does not affect the units used in the protocol (always 512 bytes)": > > This likely refers to how certain fields in the virtio-blk protocol are expressed, not the actual logical block size used for I/O operations. For example: capacity reporting, max_discard_sectors and Other fields that are expressed in 512 byte units. The way the fields are expressed in the virtio-blk protocol is what is relevant in the virtio spec. Even if a virtio-blk device is sending requests to a physical SCSI/ATA/... device behind the scenes, it's the job of that virtio-blk device implementation to handle the translation and whatever mismatches there may be between the protocols. Particularly, the "sector" value in a virtio_blk_req used for VIRTIO_BLK_T_IN/VIRTIO_BLK_T_OUT is always in 512-byte units, regardless of what blk_size the device reports, and regardless of what underlying physical media may be used.