RE: Curious randwrite results on raid10 md device

Jerome

Preconditioning an SSD generally requires writing the entire capacity of the drive 1.5 to 2 times.  If doing random operations, you would precondition using random writes of the same character (size and mix) as your test; if sequential, likewise.  The point is that an SSD performs garbage collection and wear-leveling operations in the background, and while those background operations are executing, the drive is much slower.  Performance is unstable unless the background functions are running continuously, which is what yields the sustained performance of the drive.  If the device is idle for a while, it can catch up and you return to close to the out-of-box performance of the drive.
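As a rough sketch, a precondition job along those lines might look like the following fio job file.  The device name, block sizes, and loop counts are placeholders -- adjust them to your array and to the size/mix of the test you intend to run:

```ini
; hypothetical preconditioning job -- substitute your own device and sizes
[global]
filename=/dev/md0     ; the raid10 md device under test (placeholder)
direct=1
ioengine=libaio
iodepth=32

[seq-fill]
rw=write
bs=128k
loops=2               ; write ~2x the full capacity sequentially

[rand-precondition]
stonewall             ; wait for the sequential fill to finish first
rw=randwrite
bs=4k                 ; match the block size of the real randwrite test
loops=2
```

The sequential fill guarantees every LBA has been written at least once; the random pass then forces the drive into the steady-state garbage-collection behavior that a random-write workload will actually see.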

When performing pure read tests, the drive still needs to be preconditioned in the same fashion as for writes.  But unless your measurement is short enough that garbage collection is still running throughout the measurement period, you can let the box idle for a while and then run the read test after the drives have settled (the measured performance can differ from when the background functions are running, but it should be stable).

Thus, writing a 10G file doesn't come close to enough data.  You wouldn't have gotten past the transition period and are probably just measuring the cache.  It's not unusual for preconditioning operations to take over a day before running the real test (depending on the size of the device).  Since you are using RAID, it's the size of the whole array.  Not fun.
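To get a feel for the time involved, here is the back-of-the-envelope arithmetic for a 2x-capacity pass.  The array size and sustained write speed below are hypothetical numbers, not measurements from your system:

```python
# Rough estimate: how long does a 2x-capacity precondition take?
# Hypothetical figures -- substitute your array size and sustained write speed.
array_bytes = 4 * 10**12           # e.g. a 4 TB array
write_speed = 400 * 10**6          # e.g. 400 MB/s sustained sequential write
passes = 2                         # write the full capacity twice

seconds = passes * array_bytes / write_speed
hours = seconds / 3600
print(f"{hours:.1f} hours")        # ~5.6 hours for these numbers
```

At random-write speeds, which are typically far lower than sequential, the same arithmetic easily stretches past a day.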

An alternative to preconditioning is to simply run your test until the results stabilize (experiment to determine some minimum amount of data or time needed to get past the transition period).  For example, run without a ramp period and use a long runtime with the IOPS log averaging every 1 to 5 seconds.  Then plot the log results against time to determine roughly how long it takes for the results to stabilize, and set your ramp time to longer than that duration.  Also, always do the IOPS sampling in any regression tests, in order to verify that the results did not vary too much during the test and are still valid.
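A crude way to find that stabilization point from the log, rather than eyeballing a plot, is sketched below.  It assumes the common fio iops-log line format of `time_msec, value, direction, blocksize, ...` (produced with `write_iops_log=` and `log_avg_msec=`); the window size and tolerance are arbitrary starting points, not recommendations:

```python
# Sketch: estimate when IOPS settle, from a fio iops log.
# Assumes each log line is "time_msec, value, direction, blocksize, ...".
import csv

def load_iops_log(path):
    """Return a list of (seconds, iops) samples from a fio iops log file."""
    samples = []
    with open(path) as f:
        for row in csv.reader(f):
            if len(row) >= 2:
                samples.append((int(row[0]) / 1000.0, float(row[1])))
    return samples

def stabilization_time(samples, window=5, tolerance=0.10):
    """Return the first time at which `window` consecutive samples all sit
    within +/- tolerance of their own mean -- a crude steady-state check.
    Returns None if the run never settles."""
    for i in range(len(samples) - window + 1):
        vals = [v for _, v in samples[i:i + window]]
        mean = sum(vals) / window
        if mean > 0 and all(abs(v - mean) <= tolerance * mean for v in vals):
            return samples[i][0]
    return None
```

Set your ramp_time to something comfortably longer than whatever this reports.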

Sorry to say, solid state storage performance has complications that are very different from those of HDDs.
Hope that helps.

Kris Davis

-----Original Message-----
From: Jérôme Charaoui [mailto:jcharaoui@xxxxxxxxxxxxxxxxxx] 
Sent: Tuesday, February 6, 2018 2:31 PM
To: Kris Davis <Kris.Davis@xxxxxxx>; Sitsofe Wheeler <sitsofe@xxxxxxxxx>
Cc: fio <fio@xxxxxxxxxxxxxxx>
Subject: Re: Curious randwrite results on raid10 md device

Hi Kris,

Le 2018-02-06 à 10:24, Kris Davis a écrit :
> The performance variation you mention below doesn't sound unusual for a non-preconditioned SSD.  You didn't mention what sort of "preconditioning" you did to the SSDs prior to your measurements.  You may be unaware that SSDs usually require some extensive preconditioning operations to obtain stable results.  Take a look at the SNIA test methodology for a pretty good explanation (http://www.snia.org/sites/default/files/SSS_PTS_Enterprise_v1.1.pdf), I think starting around page 18.

I think you hit the nail on the head here. I didn't do any sort of preconditioning, as I didn't think it necessary. To test this, I created a logical volume and an ext4 filesystem on top of the md device and ran the same test, except with a fixed-size 10G I/O file.

I noticed that at the beginning, fio "lays out" this file, which may be just enough "preconditioning": the random writes are much more stable from the start, and the curve observed in the previous test is absent.

Thanks,

-- Jerome

