On 2020/09/05 22:38, Ian S. Worthington wrote:
> I'm trying to establish if a new disk is SMR or not, or has any other
> characteristics that would make it unsuitable for use in a zfs array.
>
> CrystalDiskMark suggests it has a speed of 6~8 MB/s in its RND4K testing.
>
> iiuc SMR disks contain a CMR area, possibly of variable size, which is used
> as a cache, so to test a drive I need to ensure I fill this cache so the
> drive is forced to start shingling.

That is not necessarily true. One can handle the SMR sequential write
constraint using a log-structured approach that does not require any CMR
caching. It really depends on how the disk FW is implemented, but generally,
that is not public information, unfortunately.

> As the disk is 14TB, my first test used:
>
> sudo fio --name TEST --eta-newline=5s --filename=/dev/sda --rw=randwrite
>   --size=100t --io_size=14t --ioengine=libaio --iodepth=1 --direct=1
>   --numjobs=1 --runtime=10h --group_reporting
>
> which reported:
>
> TEST: (groupid=0, jobs=1): err= 0: pid=4685: Sat Sep 5 07:42:02 2020
>   write: IOPS=490, BW=1962KiB/s (2009kB/s)(67.4GiB/36000002msec); 0 zone resets
>     slat (usec): min=16, max=10242, avg=41.02, stdev=11.10
>     clat (usec): min=17, max=371540, avg=1980.75, stdev=1016.94
>      lat (usec): min=283, max=371587, avg=2024.00, stdev=1016.92
>     clat percentiles (usec):
>      |  1.00th=[  486],  5.00th=[  594], 10.00th=[ 1074], 20.00th=[ 1418],
>      | 30.00th=[ 1565], 40.00th=[ 1713], 50.00th=[ 1876], 60.00th=[ 2040],
>      | 70.00th=[ 2245], 80.00th=[ 2474], 90.00th=[ 2933], 95.00th=[ 3589],
>      | 99.00th=[ 4686], 99.50th=[ 5211], 99.90th=[ 8356], 99.95th=[11863],
>      | 99.99th=[21627]
>    bw (  KiB/s): min=  832, max= 7208, per=100.00%, avg=1961.66, stdev=105.29, samples=72000
>    iops        : min=  208, max= 1802, avg=490.40, stdev=26.31, samples=72000
>
> I have a number of concerns about this test:
>
> 1. Why is the average speed, 2MB/s, so much lower than that reported by
> CrystalDiskMark?

Likely because CrystalDiskMark's runs are very short and do not trigger
internal sector management (GC) by the disk. Your 10h run most likely did.

> 2. After running for 10 hours, only 67 GiB were written. This could easily
> not yet have filled any CMR cache on a SMR disk, rendering the test
> worthless.

Likely no. Whatever CMR space the disk has (if any at all) was likely filled.
The internal disk sector movements needed to handle the SMR sequential write
constraint are causing enormous overhead, which is why only 67 GiB were
written. Your 4K random write test is the worst possible workload for a
drive-managed SMR disk. You are simply seeing what the drive performance is
given the horrible conditions it is subjected to.

> I then ran some 5m tests, using different blocksizes in the command
>
> sudo fio --name TEST --eta-newline=5s --filename=/dev/sda --rw=randwrite
>   --size=100t --io_size=14t --ioengine=libaio --iodepth=1 --direct=1
>   --numjobs=1 --runtime=5m --group_reporting --blocksize=xxx
>
> with the result:
>
> blksize   speed (MB/s)   IOPS
> 4k        2              490
> 1M        100            97
> 10M       130            12
> 100M      160            1~2
> 1G        160            -
>
> 3. I'm considering running a dual test, where I first write, say, 10TB of
> data with a blocksize of 1M (28 hours), followed by 10 hours of 4k writes
> again. Although the 1M block contents will be sequential data, can I assume
> that enough of them will go via any CMR cache in order to fill it up and
> reveal any slowdown?
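Whether enough of those 1M sequential writes would actually land in a CMR
cache again depends on the drive FW, so there is no guarantee the dual test
will reveal anything. If you still want to try it, a rough sketch of the two
phases could look like this (only a sketch, built from the sizes and runtimes
you proposed; the job names and the --io_size/--time_based choices are
assumptions, so adjust to taste):

# Phase 1: ~10TB sequential fill with 1M writes
sudo fio --name FILL --eta-newline=5s --filename=/dev/sda --rw=write \
    --blocksize=1M --io_size=10t --ioengine=libaio --iodepth=1 --direct=1 \
    --numjobs=1 --group_reporting

# Phase 2: sustained 4k random writes; watch for the point where BW collapses
sudo fio --name RND4K --eta-newline=5s --filename=/dev/sda --rw=randwrite \
    --blocksize=4k --ioengine=libaio --iodepth=1 --direct=1 --numjobs=1 \
    --runtime=10h --time_based --group_reporting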
On Linux, one easy thing to check is to look at:

cat /sys/block/<disk name>/device/scsi_disk/X:Y:Z:N/zoned_cap

A drive-managed SMR disk that is not hiding its true nature will say
"drive-managed". You will need kernel 5.8 or newer to have this attribute
file.

Otherwise, you can use SG to inspect VPD page 0xB1 (block device
characteristics). Look for the value of bits 4-5 of byte 8 (ZONED field). If
the value is 2 (10b), then your disk is a drive-managed SMR disk (example
commands are sketched below).

-- 
Damien Le Moal
Western Digital Research
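As a concrete example of both checks (a sketch only; this assumes the disk is
/dev/sda, a kernel new enough to have the zoned_cap attribute for the first
command, and sg3_utils installed for sg_vpd, whose exact output wording varies
between versions):

# sysfs attribute (kernel 5.8+); a drive-managed disk should report "drive-managed"
cat /sys/block/sda/device/scsi_disk/*/zoned_cap

# decoded VPD page 0xB1 (block device characteristics); look for the zoned field
sg_vpd --page=bdc /dev/sda

# same page in hex: check bits 4-5 of byte 8 (ZONED field), 10b = drive managed
sg_vpd --page=bdc --hex /dev/sda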