> On 13 February 2017 at 15:57, Peter Maloney <peter.maloney@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> Then you're not aware of what the SMR disks do. They are just slow for
> all writes, having to read the tracks around, then write it all again
> instead of just the one thing you really wanted to write, due to
> overlap. Then to partially mitigate this, they have some tiny write
> buffer like 8GB flash, and then they use that for the "normal" speed,
> and then when it's full, you crawl (at least this is what the Seagate
> ones do). Journals aren't designed to solve that... they help prevent
> the sync load on the OSD, but don't somehow make the throughput higher
> (at least not sustained). Even if the journal was perfectly designed for
> performance, it would still do absolutely nothing if it's full and the
> disk is still busy with the old flushing.
>

Well, that explains it indeed. I wasn't aware of the additional buffer inside an SMR disk.

I was asked to look at this system for somebody who bought SMR disks without knowing what they were. As I never touch these disks, I found the behavior odd.

The buffer explains it a lot better; I wasn't aware that SMR disks have that.

SMR shouldn't be used in Ceph without proper support in BlueStore or an SMR-aware XFS.

Wido

>
> On 02/13/17 15:49, Wido den Hollander wrote:
> > Hi,
> >
> > I have an odd case with SMR disks in a Ceph cluster. Before I continue, yes, I am fully aware of SMR and Ceph not playing along well, but there is something happening which I'm not able to fully explain.
> >
> > On a 2x replica cluster with 8TB Seagate SMR disks I can write at about 30MB/sec to each disk using a simple RADOS bench:
> >
> > $ rados bench -t 1
> > $ time rados put 1GB.bin
> >
> > Both ways I found out that the disk can write at that rate.
> >
> > Now, when I start a benchmark with 32 threads it writes fine. Not super fast, but it works.
> >
> > After 15 minutes or so various disks go to 100% busy and just stay there. These OSDs are being marked as down and some even commit suicide due to threads timing out.
> >
> > Stopping the RADOS bench and starting the OSDs again resolves the situation.
> >
> > I am trying to explain what's happening. I'm aware that SMR isn't very good at random writes. To partially overcome this there are Intel DC 3510s in there as journal SSDs.
> >
> > Can anybody explain why this 100% busy pops up after 15 minutes or so?
> >
> > Obviously it would be best if BlueStore had SMR support, but for now it's just Filestore with XFS on there.
> >
> > Wido

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
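
As a rough back-of-envelope illustration of the buffer explanation in this thread, here is a small Python sketch. It assumes a simple linear model in which the drive's write buffer fills at the difference between the incoming write rate and the rate at which the drive flushes to the shingled area. The 8 GB buffer size is only the approximate figure mentioned above ("like 8GB") and the 15 minutes is the observed time until the disks went 100% busy; the function names and the example rates are hypothetical, and real drives will not behave this neatly.

```python
def seconds_until_buffer_full(buffer_mb, ingest_mb_s, drain_mb_s):
    """Time until a write buffer of buffer_mb fills up.

    Returns None when the drive drains at least as fast as data
    arrives, i.e. the buffer never fills in this simplified model.
    """
    net_fill_rate = ingest_mb_s - drain_mb_s
    if net_fill_rate <= 0:
        return None
    return buffer_mb / net_fill_rate


def implied_net_fill_rate(buffer_mb, seconds_to_stall):
    """Net fill rate (MB/s) that would exhaust the buffer in the given time."""
    return buffer_mb / seconds_to_stall


if __name__ == "__main__":
    BUFFER_MB = 8 * 1000       # assumption: "like 8GB" buffer from the thread
    STALL_AFTER_S = 15 * 60    # "after 15 minutes or so"

    rate = implied_net_fill_rate(BUFFER_MB, STALL_AFTER_S)
    print(f"Implied net fill rate: {rate:.1f} MB/s per disk")

    # Hypothetical rates, only to show how the model is used.
    t = seconds_until_buffer_full(BUFFER_MB, ingest_mb_s=40, drain_mb_s=30)
    print(f"Buffer full after {t:.0f} s at 40 MB/s in, 30 MB/s out")
```

Under these assumptions, a sustained surplus of only a few MB/s per disk over what the drive can flush would be enough to exhaust the buffer in about a quarter of an hour, which is consistent with the stall described above. The journal SSDs do not change this, since they sit in front of the disk and do not raise its sustained flush rate.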