I just ran a test on a Samsung 850 Pro 500GB (how should I interpret the following output?)

[root@compute-01 tmp]# fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=76.0MiB/s][r=0,w=19.7k IOPS][eta 00m:00s]
journal-test: (groupid=0, jobs=1): err= 0: pid=6969: Mon Jul 16 14:21:27 2018
  write: IOPS=20.1k, BW=78.6MiB/s (82.5MB/s)(4719MiB/60001msec)
    clat (usec): min=36, max=4525, avg=47.22, stdev=16.65
     lat (usec): min=36, max=4526, avg=47.57, stdev=16.69
    clat percentiles (usec):
     |  1.00th=[   39],  5.00th=[   40], 10.00th=[   40], 20.00th=[   41],
     | 30.00th=[   43], 40.00th=[   48], 50.00th=[   49], 60.00th=[   50],
     | 70.00th=[   50], 80.00th=[   51], 90.00th=[   52], 95.00th=[   53],
     | 99.00th=[   62], 99.50th=[   65], 99.90th=[  108], 99.95th=[  363],
     | 99.99th=[  396]
   bw (  KiB/s): min=72152, max=96464, per=100.00%, avg=80581.45, stdev=7032.18, samples=119
   iops        : min=18038, max=24116, avg=20145.34, stdev=1758.05, samples=119
  lat (usec)   : 50=71.83%, 100=28.06%, 250=0.03%, 500=0.08%, 750=0.01%
  lat (usec)   : 1000=0.01%
  lat (msec)   : 2=0.01%, 10=0.01%
  cpu          : usr=9.44%, sys=31.95%, ctx=1209952, majf=0, minf=78
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1207979,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=78.6MiB/s (82.5MB/s), 78.6MiB/s-78.6MiB/s (82.5MB/s-82.5MB/s), io=4719MiB (4948MB), run=60001-60001msec

Disk stats (read/write):
  sda: ios=0/1205921, merge=0/29, ticks=0/41418, in_queue=40965, util=68.35%
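As a quick sanity check on the numbers above: the bandwidth and IOPS lines describe the same workload, since 78.6 MiB/s of 4 KiB writes works out to roughly 20.1k writes per second, matching "IOPS=20.1k". Below is a rough sketch of that arithmetic plus the dd-based checks discussed further down in the thread (the 4k direct+dsync sync-write test and an incompressible-data variant of the blog's sequential test). This assumes GNU dd and bc are available; /dev/sdX and the counts are placeholders, and both dd runs overwrite their target, so only point them at a scratch device or file:

# Cross-check bandwidth vs. IOPS from the fio summary:
# 78.6 MiB/s at 4 KiB per write = 78.6 * 1024 / 4 ≈ 20.1k IOPS
echo "78.6 * 1024 / 4" | bc            # prints ~20121 (writes per second)

# Worst-case sync-write estimate with dd (the "4k oflag=direct,dsync" test
# mentioned below). WARNING: overwrites the start of /dev/sdX.
dd if=/dev/zero of=/dev/sdX bs=4k count=100000 oflag=direct,dsync

# Incompressible-data variant of the blog's sequential test (urandom instead
# of zero, as suggested below); note /dev/urandom itself can be the bottleneck.
dd if=/dev/urandom of=/mnt/rawdisk/data.bin bs=1G count=20 iflag=fullblock oflag=direct
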
On Mon, Jul 16, 2018 at 1:18 PM, Michael Kuriger <mk7193@xxxxxxxxx> wrote:
> I dunno, to me benchmark tests are only really useful to compare different drives.
>
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Paul Emmerich
> Sent: Monday, July 16, 2018 8:41 AM
> To: Satish Patel
> Cc: ceph-users
> Subject: Re: SSDs for data drives
>
> This doesn't look like a good benchmark:
>
> (from the blog post)
> dd if=/dev/zero of=/mnt/rawdisk/data.bin bs=1G count=20 oflag=direct
>
> 1. it writes compressible data, which some SSDs might compress; you should use urandom
> 2. that workload does not look like something Ceph will do to your disk, like not at all
>
> If you want a quick estimate of an SSD in a worst-case scenario: run the usual
> 4k oflag=direct,dsync test (or better: fio).
> A bad SSD will get < 1k IOPS, a good one > 10k.
>
> But that doesn't test everything. In particular, performance might degrade as
> the disks fill up. Also, it's the absolute worst case, i.e., a disk used for
> multiple journal/wal devices.
>
> Paul
>
> 2018-07-16 10:09 GMT-04:00 Satish Patel <satish.txt@xxxxxxxxx>:
> https://blog.cypressxt.net/hello-ceph-and-samsung-850-evo/
>
> On Thu, Jul 12, 2018 at 3:37 AM, Adrian Saul <Adrian.Saul@xxxxxxxxxxxxxxxxx> wrote:
>> We started our cluster with consumer (Samsung EVO) disks and the write
>> performance was pitiful; they had periodic spikes in latency (average of
>> 8ms, but much higher spikes) and just did not perform anywhere near where
>> we were expecting.
>>
>> When replaced with SM863-based devices the difference was night and day.
>> The DC-grade disks held a nearly constant low latency (constantly sub-ms),
>> no spiking, and performance was massively better. For a period I ran both
>> disks in the cluster and was able to graph them side by side with the same
>> workload. This was not even a moderately loaded cluster, so I am glad we
>> discovered this before we went full scale.
>>
>> So while you certainly can do cheap and cheerful and let the data
>> availability be handled by Ceph, don't expect the performance to keep up.
>>
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Satish Patel
>> Sent: Wednesday, 11 July 2018 10:50 PM
>> To: Paul Emmerich <paul.emmerich@xxxxxxxx>
>> Cc: ceph-users <ceph-users@xxxxxxxxxxxxxx>
>> Subject: Re: SSDs for data drives
>>
>> Prices go way up if I am picking Samsung SM863a for all data drives.
>>
>> We have many servers running on consumer-grade SSD drives and we have never
>> noticed any performance issues or faults so far (but we have never used Ceph
>> before).
>>
>> I thought that is the whole point of Ceph: to provide high availability if a
>> drive goes down, and also parallel reads from multiple OSD nodes.
>>
>> Sent from my iPhone
>>
>> On Jul 11, 2018, at 6:57 AM, Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
>> Hi,
>>
>> we've no long-term data for the SM variant.
>> Performance is fine as far as we can tell, but the main difference between
>> these two models should be endurance.
>>
>> Also, I forgot to mention that my experiences are only for the 1, 2, and 4 TB
>> variants. Smaller SSDs are often proportionally slower (especially below 500GB).
>>
>> Paul
>>
>> Robert Stanford <rstanford8896@xxxxxxxxx>:
>> Paul -
>>
>> That's extremely helpful, thanks. I do have another cluster that uses
>> Samsung SM863a just for journals (spinning disks for data). Do you happen
>> to have an opinion on those as well?
>>
>> On Wed, Jul 11, 2018 at 4:03 AM, Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
>> PM/SM863a are usually great disks and should be the default go-to option;
>> they outperform even the more expensive PM1633 in our experience.
>> (But that really doesn't matter if it's for the full OSD and not as a
>> dedicated WAL/journal.)
>>
>> We got a cluster with a few hundred SanDisk Ultra II (discontinued, I
>> believe) that was built on a budget. Not the best disk, but great value.
>> They have been running for ~3 years now with very few failures and
>> okayish overall performance.
>>
>> We also got a few clusters with a few hundred SanDisk Extreme Pro, but we
>> are not yet sure about their long-term durability as they are only ~9
>> months old (average of ~1000 write IOPS on each disk over that time).
>> Some of them report only 50-60% lifetime left.
>>
>> For NVMe, the Intel NVMe 750 is still a great disk.
>>
>> Be careful to get these exact models. Seemingly similar disks might be
>> just completely bad; for example, the Samsung PM961 is just unusable for
>> Ceph in our experience.
>>
>> Paul
>>
>> 2018-07-11 10:14 GMT+02:00 Wido den Hollander <wido@xxxxxxxx>:
>>
>> On 07/11/2018 10:10 AM, Robert Stanford wrote:
>>> In a recent thread the Samsung SM863a was recommended as a journal SSD.
>>> Are there any recommendations for data SSDs, for people who want to use
>>> just SSDs in a new Ceph cluster?
>>
>> Depends on what you are looking for, SATA, SAS3 or NVMe?
>>
>> I have very good experiences with these drives running with BlueStore in
>> them in SuperMicro machines:
>>
>> - SATA: Samsung PM863a
>> - SATA: Intel S4500
>> - SAS: Samsung PM1633
>> - NVMe: Samsung PM963
>>
>> Running WAL+DB+DATA with BlueStore on the same drives.
>>
>> Wido
>>
>>> Thank you
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com