Re: List of SSDs


 



> On 25 Feb 2016, at 22:41, Shinobu Kinjo <skinjo@xxxxxxxxxx> wrote:
> 
>> Just beware of HBA compatibility, even in passthrough mode some crappy firmwares can try and be smart about what you can do (LSI-Avago, I'm looking your way for crippling TRIM, seriously WTH).
> 
> This is very good to know.
> Can anybody elaborate on this a bit more?
> 

I can, to some degree - it's been a while since I investigated this.
For TRIM/discard to work, you need:
1) a working TRIM/discard command on the drive
2) the SCSI/libata layer (I think) to detect how many blocks can be discarded at once, what the discard block size is, etc.
   Those properties end up in /sys/block/xxx/queue/discard_* (see below for a quick way to check them).

3) a filesystem that supports discard (it looks at those discard_* properties to determine when and what to discard)
4) there are also flags (hdparm -I shows them) that describe what happens after a TRIM - either the data is zeroed or indeterminate data is returned (it is possible to TRIM a sector and then read back the original data - the drive doesn't actually need to erase anything, it simply marks the sector as unused in its bitmap and GC does its magic when it feels like it, if ever)
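
A quick way to check those properties (sdX is just a placeholder here, and the exact set of files varies a bit between kernel versions):

    cat /sys/block/sdX/queue/discard_granularity   # smallest discardable unit, in bytes
    cat /sys/block/sdX/queue/discard_max_bytes     # 0 means the block layer won't discard at all
    cat /sys/block/sdX/queue/discard_zeroes_data   # 1 if reads after a discard return zeroes

If discard_max_bytes is 0, nothing above that layer (the filesystem, fstrim, ...) will issue discards for the device, no matter what the drive itself supports.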

RAID controllers need to have some degree of control over this, because they need to be able to compare the drive contents when scrubbing (the same probably somehow applies to mdraid) either by maintaining some bitmap of used blocks or by trusting the drives to be deterministic. If you discard a sector on a HW RAID, both drives need to start returning the same data or scrubbing will fail. Some drives guarantee that and some don't.
You either have DRAT - Deterministic Read After TRIM (this only guarantees that the data won't change between reads; it can still be arbitrary garbage),
or you have DZAT - Deterministic Read Zero After TRIM (subsequent reads return only zeroes),
or you have neither (which is no big deal, except for RAID). hdparm -I shows which of these a drive claims - see the example below.
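
A rough check for the DRAT/DZAT flags (sdX is a placeholder, and the exact wording of hdparm's output may differ slightly between versions):

    hdparm -I /dev/sdX | grep -i TRIM
    #   "Data Set Management TRIM supported"    -> the drive accepts TRIM at all
    #   "Deterministic read data after TRIM"    -> DRAT
    #   "Deterministic read ZEROs after TRIM"   -> DZAT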

Even though I don't use my LSI HBAs in IR (RAID) mode, the firmware doesn't like that my drives lack DZAT/DRAT (or rather lacked it - this doesn't apply to the Intels I have now) and cripples the discard_* parameters to disallow the use of TRIM. And that mostly works, because the filesystem doesn't get the discard_* values it needs for discard to work...
... BUT it doesn't cripple the TRIM command itself, so running hdparm --trim-sector-ranges still works (lol). I suppose that if those discard_* parameters were made read/write (I actually found a patch back then that does exactly that), we could re-enable TRIM in spite of the firmware nonsense, but with modern SSDs it's mostly pointless anyway and LSI sucks, so who cares :-)
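
For illustration, this is roughly what that situation looks like (sdX and the sector range are placeholders, and note that --trim-sector-ranges really does throw away the data in that range, hence the scary flag hdparm makes you add):

    # the HBA firmware zeroed out the discard limits, so the kernel won't discard anything
    cat /sys/block/sdX/queue/discard_max_bytes
    0

    # ...yet a TRIM issued directly to the drive still goes through
    hdparm --please-destroy-my-drive --trim-sector-ranges 1000:8 /dev/sdX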

*
Sorry if I mixed up some layers - maybe it isn't the filesystem that issues the discards but another layer in the kernel, and I'm also not sure exactly how and when the discard_* values are detected - but in essence it works as described above.

Jan



> Rgds,
> Shinobu
> 
> ----- Original Message -----
> From: "Jan Schermer" <jan@xxxxxxxxxxx>
> To: "Nick Fisk" <nick@xxxxxxxxxx>
> Cc: "Robert LeBlanc" <robert@xxxxxxxxxxxxx>, "Shinobu Kinjo" <skinjo@xxxxxxxxxx>, ceph-users@xxxxxxxxxxxxxx
> Sent: Thursday, February 25, 2016 11:10:41 PM
> Subject: Re:  List of SSDs
> 
> We are very happy with S3610s in our cluster.
> We had to flash a new firmware because of latency spikes (NCQ-related), but had zero problems after that...
> Just beware of HBA compatibility, even in passthrough mode some crappy firmwares can try and be smart about what you can do (LSI-Avago, I'm looking your way for crippling TRIM, seriously WTH).
> 
> Jan
> 
> 
>> On 25 Feb 2016, at 14:48, Nick Fisk <nick@xxxxxxxxxx> wrote:
>> 
>> There are two factors really
>> 
>> 1. Suitability for use in Ceph
>> 2. Number of people using them
>> 
>> For #1, there are a number of people using various different drives, so lots of options. The blog article linked is a good place to start.
>> 
>> For #2 - and I think this is quite important - lots of people use the Intel S3xx drives. This means any problems you face will likely get a lot of input from other people. You are also less likely to face surprises, as most use cases have already been covered.
>> 
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Robert LeBlanc
>> Sent: 25 February 2016 05:56
>> To: Shinobu Kinjo <skinjo@xxxxxxxxxx>
>> Cc: ceph-users <ceph-users@xxxxxxxxxxxxxx>
>> Subject: Re:  List of SSDs
>> 
>> We are moving to the Intel S3610; from our testing it is a good balance between price, performance and longevity. But as with all things, do your testing ahead of time. This will be our third model of SSDs for our cluster. The S3500s didn't have enough life and performance tapers off as they get full. The Micron M600s looked good with the Sebastian journal tests, but once in use for a while they go downhill pretty badly. We also tested Micron M500dc drives and they were on par with the S3610s, but are more expensive and closer to EoL. The S3700s didn't have quite the same performance as the S3610s, but they will last forever, are very stable in terms of performance, and have the best power loss protection.
>> 
>> Short answer is: test them for yourself to make sure they will work. You are pretty safe with the Intel S3xxx drives. The Micron M500dc is also pretty safe based on my experience. It has also been mentioned that someone has had good experience with a Samsung DC Pro (it has to have both DC and Pro in the name), but we weren't able to get any quickly enough to test, so I can't vouch for them.
>> 
>> Sent from a mobile device, please excuse any typos.
>> 
>> On Feb 24, 2016 6:37 PM, "Shinobu Kinjo" <skinjo@xxxxxxxxxx> wrote:
>> Hello,
>> 
>> There has been a bunch of discussion about using SSD.
>> Does anyone have any list of SSDs describing which SSD is highly recommended, which SSD is not.
>> 
>> Rgds,
>> Shinobu
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



