On 6/21/18 9:34 PM, Eric Sandeen wrote: > On 6/21/18 6:16 PM, Eric Sandeen wrote: >>> Ah, so it came from the hardware? In that case, we probably >>> shouldn't zero sunit when blkid reports this whacky case. i.e. I >>> think we should set swidth = sunit so that we retain allocation >>> alignment to the minimum IO size the device specified. >> Hrmph. >> >> yeah, I'd really like to see >> >> # blockdev --getiomin --getioopt --getss --getpbsz >> >> for all the devices in the stack, I guess... > > Ok, so Jeff shared the lsblk output with me; the underlying > device and the mpath device are both: > > MIN-IO OPT-IO PHY-SEC LOG-SEC > 8192 0 512 512 > > so logical/physical 512, with minimum io 8192 and optimal io 0. > On further inspection Jeff says this may be because an optimal > size is reported as greater than the maximum size, so we get 0 > due to the inconsistency. For this device, the user reported the following sg_vpd -a output: Optimal transfer length granularity: 16 blocks -- This is what gets exported as minimum_io_size Maximum transfer length: 32768 blocks Optimal transfer length: 65535 blocks The SCSI layer checks to see if the optimal blocks value is larger than what the device reports as its max and whether it's larger than 65535 or smaller than page size, and refuses to set the optimal io size so it defaults to 0. Given the values above, that has to be what's happening here. > Ok, well, I guess I see your point that we should probably set > sunit=swidth=8192 here. I was thinking that conflicted with > how we process it on the commandline but no, it doesn't; if you're > going to manually specify you can't set one to 0 and not the other, > but we have to make the best of what the hardware tells us. > > So we could do: > > /* > * Some odd hardware sets minimum IO but not optimal; assume > * minimal is the smallest preferred IO size, and set optimal > * to the same value, i.e. a stripe of 1 disk. > */ > if (*sunit && *swidth == 0) > *swidth = *sunit; > > But it's clearly buggy hardware. How are we to know which values > are accurate and which are not? It still may be better to just set > to no stripe geometry if we've gotten nonsense from libblockdev, > if the device wants better performance it needs to have rational > firmware... trying to 2nd guess bad values is just a crapshoot. > I guess we could issue a warning ... I think the sane defaults + warning is the way to go here too. -Jeff -- Jeff Mahoney SUSE Labs
Attachment:
signature.asc
Description: OpenPGP digital signature