On 02/07/2014 02:00 PM, Carlos Maiolino wrote:
> Hi,
>
> On Sat, Feb 01, 2014 at 02:24:33AM +0000, Albert Chen wrote:
>> [LSF/MM TOPIC] SMR: Disrupting recording technology meriting
>> a new class of storage device
>>
>> Shingled Magnetic Recording (SMR) is a disruptive technology that
>> delivers the next areal density gain for the HDD industry by
>> partially overlapping tracks. Shingling requires physical
>> writes to be sequential, which opens the question of how to
>> address this behavior at the system level. The two general
>> approaches contemplated are to do the block management either in
>> the device or in the host storage stack/file system through
>> Zoned Block Commands (ZBC).
>>
>> Using ZBC to handle SMR block management yields several benefits:
>> - Predictable performance and latency
>> - Faster development time
>> - Access to application- and system-level semantic information
>> - Scalability / fewer drive resources
>> - Higher reliability
>>
>> Essential to a host-managed approach (ZBC) is the openness of
>> Linux, and its community is a good place for WD to validate and
>> seek feedback on our thinking: where in the Linux storage stack
>> is the best place to add ZBC handling? At the device-mapper
>> layer, or somewhere else? New ideas and comments are appreciated.
>
> If you add ZBC handling into the device-mapper layer, aren't you
> assuming that all SMR devices will be managed by device-mapper?
> That doesn't look right IMHO. These devices should be manageable
> either via DM or directly via the storage layer, and any other
> layer making use of these devices (like DM, for example) should be
> able to communicate with them and send ZBC commands as needed.
>
Precisely. Adding a new device type (and a new ULD to the SCSI
midlayer) seems to be the right idea here.
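[Editor's aside: as a rough illustration of the sequential-write constraint a host-managed (ZBC) stack has to enforce, here is a small in-memory model. It is purely illustrative; the class and method names are assumptions for this sketch, not taken from the ZBC draft spec or any kernel code.]

```python
class Zone:
    """Toy model of a host-managed SMR zone with a write pointer."""

    def __init__(self, start_lba, length):
        self.start_lba = start_lba
        self.length = length
        self.write_pointer = start_lba  # next LBA that may be written

    def write(self, lba, nblocks):
        """Accept a write only if it starts exactly at the write pointer."""
        if lba != self.write_pointer:
            raise ValueError("unaligned write: zone requires sequential writes")
        if lba + nblocks > self.start_lba + self.length:
            raise ValueError("write crosses zone boundary")
        self.write_pointer += nblocks

    def reset(self):
        """Model of a ZBC write-pointer reset: the zone may be rewritten."""
        self.write_pointer = self.start_lba
```

The host (filesystem, DM target, or ULD) must track one such write pointer per zone; any write that does not land on it is rejected by the drive, which is why random-write filesystems need either a translation layer or zone-aware allocation.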
Then we could think about how to integrate this into the block layer;
e.g. we could identify the zones with partitions, or mirror the zones
via block_limits.

There is actually a good chance that we can tweak btrfs to run
unmodified on such a disk; after all, sequential writes are not a big
deal for btrfs. The only issue is that we might need to re-allocate
blocks to free up zones, but some btrfs developers have assured me
this shouldn't be too hard.

Personally I don't like the idea of _having_ to use a device-mapper
module for these things. What I would like is to give the user a
choice: if there are specialized filesystems around which can deal
with such a disk (hello, ltfs :-), then fine. If not, of course we
should have a device-mapper module to hide the grubby details from
unsuspecting filesystems.

Cheers,

Hannes
--
Dr. Hannes Reinecke                   zSeries & Storage
hare@xxxxxxx                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
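[Editor's aside: the "re-allocate blocks to free up zones" step mentioned above is essentially garbage-collection-style zone reclaim: copy a zone's still-live blocks sequentially into another zone, then rewind the emptied zone. The sketch below is an assumption-laden toy model, not btrfs code; zones are plain dicts with a data list and a write-pointer index.]

```python
def reclaim_zone(src, dst, live_blocks):
    """Move live data out of src so the whole zone can be reset and reused.

    src, dst    -- dicts with 'data' (list of blocks) and 'wp' (write pointer)
    live_blocks -- indices of blocks in src still referenced by the filesystem
    Returns dst's new write pointer.
    """
    for idx in live_blocks:
        # Writes into dst must be sequential: always append at dst's wp.
        dst['data'].append(src['data'][idx])
        dst['wp'] += 1
    # All live data has been relocated; src can now be rewound in one go.
    src['data'] = []
    src['wp'] = 0
    return dst['wp']
```

A copy-on-write filesystem like btrfs already relocates extents for balancing, which is why the developers quoted above expect this step to be comparatively easy.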