Re: [PATCH v3 0/4] Initial support for multi-actuator HDDs

On 2021/08/06 17:36, Hannes Reinecke wrote:
> On 8/6/21 6:05 AM, Damien Le Moal wrote:
>> On 2021/08/06 12:42, Martin K. Petersen wrote:
>>>
>>> Damien,
>>>
>>>> Single LUN multi-actuator hard-disks are capable of seeking and executing
>>>> multiple commands in parallel. This capability is exposed to the host
>>>> using the Concurrent Positioning Ranges VPD page (SCSI) and Log (ATA).
>>>> Each positioning range describes the contiguous set of LBAs that an
>>>> actuator serves.
>>>
>>> I have to say that I prefer the multi-LUN model.
>>
>> It is certainly easier: nothing to do :)
>> SATA, as usual, makes things harder...
>>
>>>
>>>> The first patch adds the block layer plumbing to expose concurrent
>>>> sector ranges of the device through sysfs as a sub-directory of the
>>>> device sysfs queue directory.
>>>
>>> So how do you envision this range reporting should work when putting
>>> DM/MD on top of a multi-actuator disk?
>>
>> The ranges are attached to the device request queue. So the DM/MD target driver
>> can use that information from the underlying devices for whatever possible
>> optimization. For the logical device exposed by the target driver, the ranges
>> are not limits so they are not inherited. As is, right now, DM target devices
>> will not show any range information for the logical devices they create, even if
>> the underlying devices have multiple ranges.
>>
>> The DM/MD target driver is free to set any range information pertinent to the
>> target. E.g. dm-linear could set the range information corresponding to sector
>> chunks from different devices used to build the dm-linear device.
>>
> And indeed, that would be the easiest consumer.
> One 'just' needs to have a simple script converting the sysfs ranges
> into the corresponding dm-linear table definitions, and create one DM
> device for each range.
> That would simulate the multi-LUN approach.
> Not sure if that would warrant a 'real' DM target, seeing that it's
> fully scriptable.
> 
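A minimal sketch of what such a script could look like, in Python. The
sysfs layout is an assumption here: one numbered sub-directory per range,
each holding "sector" and "nr_sectors" files, with the directory path
passed in by the caller. The helper names (read_ranges, linear_tables) are
made up for illustration. Only the dm-linear table format itself
(<logical start> <length> linear <device> <offset>) is taken as given.

```python
import os


def read_ranges(ranges_dir):
    """Read (start_sector, nr_sectors) pairs from a sysfs-like directory.

    Assumed layout (hypothetical, adjust to what the kernel exposes):
    one numbered sub-directory per range, each containing the files
    'sector' and 'nr_sectors'.
    """
    ranges = []
    names = [n for n in os.listdir(ranges_dir) if n.isdigit()]
    for name in sorted(names, key=int):
        base = os.path.join(ranges_dir, name)
        with open(os.path.join(base, "sector")) as f:
            start = int(f.read())
        with open(os.path.join(base, "nr_sectors")) as f:
            count = int(f.read())
        ranges.append((start, count))
    return ranges


def linear_tables(dev, ranges):
    """Return one dm-linear table line per range.

    Each DM device starts at logical sector 0 and maps 'count' sectors
    of 'dev' beginning at the range's first sector.
    """
    return ["0 %d linear %s %d" % (count, dev, start)
            for start, count in ranges]
```

Each returned line could then be handed to dmsetup, one DM device per
range, e.g. dmsetup create sdX-r0 --table "0 1000 linear /dev/sdX 0"
(device and target names are placeholders).
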
>>> And even without multi-actuator drives, how would you express concurrent
>>> ranges on a DM/MD device sitting on top of several single-actuator
>>> devices?
>>
>> Similar comment as above: it is up to the DM/MD target driver to decide if range
>> information can be useful. For dm-linear, there are obvious cases where it is.
>> Ex: 2 single actuator drives concatenated together can generate 2 ranges
>> similarly to a real split-actuator disk. Expressing the chunks of a dm-linear
>> setup as ranges may not always be possible though, that is, if we keep the
>> assumption that a range is independent from others in terms of command
>> execution. Ex: a dm-linear setup that shuffles a drive LBA mapping (high to low
>> and low to high) has no business showing sector ranges.
>>
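To make the dm-linear case concrete, here is a rough sketch of how ranges
could be derived from a linear table. It is not what the series
implements. It assumes every backing device is a single-actuator drive,
that segments are given in logical order, and it uses a made-up helper
name (ranges_from_linear). A device that backs two non-adjacent chunks
breaks the "ranges are independent" assumption, so nothing is reported in
that case, nor when everything ends up on a single actuator.

```python
def ranges_from_linear(segments):
    """Derive concurrent ranges for a dm-linear style mapping.

    segments: list of (logical_start, length, backing_dev) tuples,
    sorted by logical_start. Each backing device is assumed to be a
    single-actuator drive.

    Returns a list of (start, length) ranges, or None when ranges
    should not be exposed: a device backs non-adjacent chunks, or
    fewer than two independent ranges result.
    """
    ranges = []
    seen = set()
    prev_dev = None
    for start, length, dev in segments:
        if dev == prev_dev:
            # Adjacent chunk on the same actuator: extend the last range.
            first, total = ranges[-1]
            ranges[-1] = (first, total + length)
        elif dev in seen:
            # Same actuator serves two separate chunks: not independent.
            return None
        else:
            seen.add(dev)
            ranges.append((start, length))
        prev_dev = dev
    if len(ranges) < 2:
        return None
    return ranges
```

With two drives concatenated this yields two ranges, while a mapping that
swaps the two halves of one drive yields none.
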
>>> While I appreciate that it is easy to just export what the hardware
>>> reports in sysfs, I also think we should consider how filesystems would
>>> use that information. And how things would work outside of the simple
>>> fs-on-top-of-multi-actuator-drive case.
>>
>> Without any change anywhere in existing code (kernel and applications using raw
>> disk accesses), things will just work as is. The multi/split actuator drive will
>> behave as a single actuator drive, even for commands spanning range boundaries.
>> Your guess on potential IOPS gains is as good as mine in this case. Performance
>> will totally depend on the workload but will not be worse than an equivalent
>> single actuator disk.
>>
>> FS block allocators can definitely use the range information to distribute
>> writes among actuators. For reads, well, gains will depend on the workload,
>> obviously, but optimizations at the block IO scheduler level can improve things
>> too, especially if the drive is being used at a QD beyond its capability (that
>> is, requests are accumulated in the IO scheduler).
>>
>> Similar write optimization can be achieved by applications using block device
>> files directly. This series is intended for this case for now. FS and block IO
>> scheduler optimization can be added later.
>>
>>
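As an illustration of the application case, a small sketch of how a
program doing raw block device accesses might use the ranges to spread
writes. The helper names and the "fewest in-flight writes" policy are
assumptions chosen for the example, not something the series provides.
Ranges are (start_sector, nr_sectors) tuples, sorted and non-overlapping.

```python
import bisect


def range_index(ranges, sector):
    """Return the index of the range containing 'sector', or None."""
    starts = [start for start, _ in ranges]
    i = bisect.bisect_right(starts, sector) - 1
    if i >= 0 and sector < ranges[i][0] + ranges[i][1]:
        return i
    return None


def pick_range(ranges, inflight):
    """Pick the range (actuator) with the fewest in-flight writes.

    inflight[i] is the number of writes currently queued to range i.
    Ties go to the lowest index.
    """
    return min(range(len(ranges)), key=lambda i: inflight[i])
```

The application would allocate its next write in the range returned by
pick_range() and use range_index() to account for completions.
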
> Rumours have it that Paolo Valente is working on adapting BFQ to utilize
> the range information for better actuator utilisation.

Paolo has a talk on this subject scheduled for SNIA SDC 2021.

https://storagedeveloper.org/events/sdc-2021/abstracts#hd-Walker

> And eventually one should modify filesystem utilities like xfs to adapt
> the metadata layout to multi-actuator drives.
> 
> The _real_ fun starts once the HDD manufacturers start putting out
> multi-actuator SMR drives :-)

Well, that does not change things that much in the end. The same constraints
remain, and the sector ranges will be aligned to zones. So no added difficulty.

> 
> Cheers,
> 
> Hannes
> 


-- 
Damien Le Moal
Western Digital Research



