On 2021/08/17 2:10, Martin K. Petersen wrote:
> 
> Hi Damien!
> 
>> Single LUN multi-actuator hard-disks are capable of seeking and
>> executing multiple commands in parallel. This capability is exposed to
>> the host using the Concurrent Positioning Ranges VPD page (SCSI) and
>> Log (ATA). Each positioning range describes the contiguous set of LBAs
>> that an actuator serves.
> 
> I am not a big fan of the Concurrent Positioning Range terminology since
> it is very specific to the implementation of multi-actuator disk drives.
> With other types of media, "positioning" doesn't make any sense. It is
> unfortunate that CPR is the term that ended up in a spec that covers a
> wide variety of devices and media types.
> 
> I also think that "concurrent positioning" emphasizes the performance
> aspect but not so much the fault domain, which in many ways is the more
> interesting part.
> 
> The purpose of exposing this information to the filesystems must be to
> encourage them to use it. And therefore I think it is important that the
> semantics and information conveyed are applicable outside of the
> multi-actuator use case. It would be easy to expose this kind of
> information for concatenated LVM devices, etc.
> 
> Anyway. I don't really have any objections to the series from an
> implementation perspective. I do think "cpr" as you used in patch #2 is
> a better name than "crange". But again, I wish we could come up with a
> more accurate and less disk-actuator-centric terminology for the block
> layer plumbing.
> 
> I would have voted for "fault_domain" but that ignores the performance
> aspect. "independent_block_range", maybe? Why is naming always so hard?
> :(

I did struggle with the naming too, and "crange" was the best I could
come up with given the specs' wording.

With the single LUN approach, the fault domain does not really change
from a regular device. The typical use in case of bad heads would be to
replace the drive or reformat it at lower capacity with head depop.
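For what it is worth, carving out one device per actuator range is
already possible with the device-mapper linear target; a minimal sketch
(the device name and sector boundaries here are illustrative
assumptions, not values from this thread):

```shell
# Hypothetical dual-actuator drive /dev/sdX, with actuator 0 serving
# sectors 0..976773167 and actuator 1 serving 976773168..1953525167.
# dm-linear table format: <logical_start> <num_sectors> linear <dev> <offset>
dmsetup create sdX-act0 --table "0 976773168 linear /dev/sdX 0"
dmsetup create sdX-act1 --table "0 976773168 linear /dev/sdX 976773168"
# /dev/mapper/sdX-act0 and /dev/mapper/sdX-act1 can then be used as
# separate block devices, e.g. one filesystem per actuator.
```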
That could be avoided with dm-linear on top (one DM device per
actuator), though.

As for the "independent_block_range" idea, I thought of that too, but
the problem is that the tags are still shared between the two actuators,
so the ranges are not really independent at all. One actuator can starve
the other, depending on the host workload, without FS and/or IO
scheduler optimization distributing the commands between the actuators.

The above point led me to this informational-only implementation.
Without optimization, we get the same as today: no changes in
performance and use. Better IOPS is gained for lucky workloads
(typically random ones). Going forward, more reliable IOPS & throughput
gains are possible with some additional changes.

-- 
Damien Le Moal
Western Digital Research