Then you're not aware of what SMR disks actually do. They are slow for all writes: because the tracks overlap, the drive has to read the surrounding tracks and rewrite them all instead of just the one thing you actually wanted to write. To partially mitigate that, they have a small persistent write cache (something like 8GB of flash), which gives you the "normal" speed until it fills up, and then you crawl (at least that is what the Seagate ones do).

That would also explain the ~15 minute delay: the cache absorbs the writes at first, and once it is full the drive has to do the slow read-modify-write in the foreground while still flushing the backlog, so it sits at 100% busy and the OSD threads start timing out.

Journals aren't designed to solve that. They help with the sync write load on the OSD, but they don't somehow make the sustained throughput of the data disk any higher. Even a perfectly designed journal does absolutely nothing once it is full and the disk is still busy flushing the old writes. (A sketch of the full bench commands and a way to watch this happen is below the quoted mail.)

On 02/13/17 15:49, Wido den Hollander wrote:
> Hi,
>
> I have an odd case with SMR disks in a Ceph cluster. Before I continue: yes, I am fully aware of SMR and Ceph not playing along well, but something is happening here which I'm not able to fully explain.
>
> On a 2x replica cluster with 8TB Seagate SMR disks I can write about 30MB/sec to each disk using a simple RADOS bench:
>
> $ rados bench -t 1
> $ time rados put 1GB.bin
>
> Both ways I found that the disks can write at that rate.
>
> Now, when I start a benchmark with 32 threads it writes fine. Not super fast, but it works.
>
> After 15 minutes or so various disks go to 100% busy and just stay there. These OSDs get marked down and some even commit suicide due to threads timing out.
>
> Stopping the RADOS bench and starting the OSDs again resolves the situation.
>
> I am trying to explain what's happening. I'm aware that SMR isn't very good at random writes. To partially overcome this there are Intel DC 3510s in there as journal SSDs.
>
> Can anybody explain why this 100% busy pops up after 15 minutes or so?
>
> Obviously it would be best if BlueStore had SMR support, but for now it's just Filestore with XFS on there.
>
> Wido
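
For reference, a sketch of the bench commands in full form, plus a way to watch the cache fill up. The pool name "testpool" and the object name are placeholders, and the 1800s / 32-thread run is just meant to be long enough to reproduce the stall; adjust to your setup:

$ rados -p testpool bench 60 write -t 1 --no-cleanup     # single-threaded baseline, ~30MB/s per disk
$ time rados -p testpool put 1GB-obj 1GB.bin             # time a single 1GB object write
$ rados -p testpool bench 1800 write -t 32 --no-cleanup  # 32 threads, long enough to fill the SMR cache
$ iostat -xm 5                                           # on an OSD host: %util pinned at 100 while wMB/s collapses

If the cache-fill explanation is right, the OSD timeouts should line up with the moment wMB/s on the SMR data disks collapses while the journal SSDs stay largely idle.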