On 2020/05/08 18:03, Hannes Reinecke wrote:
> Hi all,
>
> this patchset adds a new metadata version 2 for dm-zoned, which brings the
> following improvements:
>
> - UUIDs and labels: Adding three more fields to the metadata containing
>   the dm-zoned device UUID and label, and the device UUID. This allows
>   for a unique identification of the devices, so that several dm-zoned
>   sets can coexist and have a persistent identification.
> - Extend random zones by an additional regular disk device: A regular
>   block device can be added together with the zoned block device, providing
>   additional (emulated) random write zones. With this it's possible to
>   handle sequential zones only devices; also there will be a speed-up if
>   the regular block device resides on a fast medium. The regular block device
>   is placed logically in front of the zoned block device, so that metadata
>   and mapping tables reside on the regular block device, not the zoned device.
> - Tertiary superblock support: In addition to the two existing sets of metadata
>   another, tertiary, superblock is written to the first block of the zoned
>   block device. This superblock is for identification only; the generation
>   number is set to '0' and the block itself is never updated. The additional
>   metadata like bitmap tables etc. is not copied.
>
> To handle this, some changes to the original handling are introduced:
> - Zones are now equidistant. Originally, runt zones were ignored, and
>   not counted when sizing the mapping tables. With the dual device setup
>   runt zones might occur at the end of the regular block device, making
>   direct translation between zone number and sector/block number complex.
>   For metadata version 2 all zones are considered to be of the same size,
>   and runt zones are simply marked as 'offline' to have them ignored when
>   allocating a new zone.
> - The block number in the superblock is now the global number, and refers to
>   the location of the superblock relative to the resulting device-mapper
>   device. Which means that the tertiary superblock contains absolute block
>   addresses, which need to be translated to the relative device addresses
>   to find the referenced block.
>
> There is an accompanying patchset for dm-zoned-tools for writing and checking
> this new metadata.
>
> As usual, comments and reviews are welcome.

I gave this series a good round of testing. See the attached picture for the
results. The test is this:
1) Set up dm-zoned
2) Format and mount with mkfs.ext4 -E packed_meta_blocks=1 /dev/mapper/xxx
3) Create files of random size between 1 and 4MB and measure the user-seen
   throughput over 100 files.
4) Run that for 2 hours

I ran this on a single-drive setup with a 15TB SMR drive, and on the same
drive with a 500GB M.2 SSD added.

For the single drive case, the usual 3 phases can be seen: start writing at
about 110MB/s, everything going to conventional zones (note conv zones are in
the middle of the disk, hence the low-ish throughput). Then after about 400s,
reclaim kicks in and the throughput drops to 60-70 MB/s. As reclaim cannot keep
up under this heavy write workload, performance drops to 20-30MB/s after 800s.
All good: without any idle time for reclaim to do its job, this is all expected.

For the dual drive case, things are more interesting:
1) The first phase is longer as, overall, there is more conventional space
   (500G SSD + 400G on the SMR drive). So we see the SSD speed first
   (~425MB/s), then the drive speed (100 MB/s), slightly lower than in the
   single drive case toward the end as reclaim triggers.
2) Some recovery back to SSD speed, then a long phase at half the speed of the
   SSD as writes go to the SSD while reclaim is running, moving data out of
   the SSD onto the disk.
3) Then a long phase at 25MB/s due to SMR disk reclaim.
4) Back up to half the SSD speed.

No crashes, no data corruption, all good.

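The workload in steps 3 and 4 can be approximated with a script along these
lines. This is only an illustrative sketch, not the script that produced the
results above: the function names (write_batch, run), the file naming, the
zero-filled payload, and the fsync() after each file are all assumptions made
for the example.

```python
import os
import random
import time

MIB = 1024 * 1024


def write_batch(directory, first_index, count=100,
                min_size=1 * MIB, max_size=4 * MIB, rng=random):
    """Write `count` files of random size; return (bytes_written, MiB/s)."""
    total = 0
    start = time.monotonic()
    for i in range(first_index, first_index + count):
        size = rng.randint(min_size, max_size)  # inclusive bounds
        path = os.path.join(directory, f"file-{i:08d}")
        with open(path, "wb") as f:
            f.write(b"\0" * size)
            f.flush()
            # Assumption: force the data out to the device so that the
            # page cache does not hide the device throughput.
            os.fsync(f.fileno())
        total += size
    elapsed = max(time.monotonic() - start, 1e-9)
    return total, total / MIB / elapsed


def run(directory, duration_s=2 * 3600, batch=100):
    """Write batches until `duration_s` has elapsed, printing one sample each."""
    index = 0
    t0 = time.monotonic()
    while time.monotonic() - t0 < duration_s:
        _, rate = write_batch(directory, index, batch)
        index += batch
        print(f"{time.monotonic() - t0:8.0f}s {rate:8.1f} MiB/s")
```

Calling run() with the mount point of the ext4 file system would print one
throughput sample per 100 files, which is the granularity of the measurement
described in step 3.
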
But it does look like we can improve performance further by not using the
drive's conventional zones as "buffer" zones. If we let those be the final
resting place of data, the SMR-disk-only reclaim would not kick in and hurt
performance as seen here. That, I think, can all be done on top of this series
though. Let's get this in first.

Mike,

I am still seeing the warning:

[ 1827.839756] device-mapper: table: 253:1: adding target device sdj caused an alignment inconsistency: physical_block_size=4096, logical_block_size=4096, alignment_offset=0, start=0
[ 1827.856738] device-mapper: table: 253:1: adding target device sdj caused an alignment inconsistency: physical_block_size=4096, logical_block_size=4096, alignment_offset=0, start=0
[ 1827.874031] device-mapper: table: 253:1: adding target device sdj caused an alignment inconsistency: physical_block_size=4096, logical_block_size=4096, alignment_offset=0, start=0
[ 1827.891086] device-mapper: table: 253:1: adding target device sdj caused an alignment inconsistency: physical_block_size=4096, logical_block_size=4096, alignment_offset=0, start=0

when mixing 512B sector and 4KB sector devices. Investigating now.

Hannes,

I pushed some minor updates to the dmzadm staging branch on top of your changes.

>
> Changes to v4:
> - Add reviews from Damien
> - Silence logging output as suggested by Mike Snitzer
> - Fixup compilation on 32bit archs
>
> Changes to v3:
> - Reorder devices such that the regular device is always at position 0,
>   and the zoned device is always at position 1.
> - Split off dmz_dev_is_dying() into a separate patch
> - Include reviews from Damien
>
> Changes to v2:
> - Kill dmz_id()
> - Include reviews from Damien
> - Sanitize uuid handling as suggested by John Dorminy
>
>
> Hannes Reinecke (14):
>   dm-zoned: add 'status' and 'message' callbacks
>   dm-zoned: store zone id within the zone structure and kill dmz_id()
>   dm-zoned: use array for superblock zones
>   dm-zoned: store device in struct dmz_sb
>   dm-zoned: move fields from struct dmz_dev to dmz_metadata
>   dm-zoned: introduce dmz_metadata_label() to format device name
>   dm-zoned: Introduce dmz_dev_is_dying() and dmz_check_dev()
>   dm-zoned: remove 'dev' argument from reclaim
>   dm-zoned: replace 'target' pointer in the bio context
>   dm-zoned: use dmz_zone_to_dev() when handling metadata I/O
>   dm-zoned: add metadata logging functions
>   dm-zoned: Reduce logging output on startup
>   dm-zoned: ignore metadata zone in dmz_alloc_zone()
>   dm-zoned: metadata version 2
>
>  drivers/md/dm-zoned-metadata.c | 664 +++++++++++++++++++++++++++++++----------
>  drivers/md/dm-zoned-reclaim.c  |  88 +++---
>  drivers/md/dm-zoned-target.c   | 376 +++++++++++++++--------
>  drivers/md/dm-zoned.h          |  35 ++-
>  4 files changed, 825 insertions(+), 338 deletions(-)
>

--
Damien Le Moal
Western Digital Research

Attachment: dm-zoned.png

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel