> On 13 February 2017 at 16:49, "Bernhard J. M. Grün" <bernhard.gruen@xxxxxxxxx> wrote:
>
> Hi,
>
> we are using SMR disks for backup purposes in our Ceph cluster.
> We have had massive problems with those disks prior to upgrading to
> kernel 4.9.x. We also dropped XFS as the filesystem and now use btrfs
> (only for those disks).
> Since we did this we no longer have such problems.
>

We have kernel 4.9 there, but XFS is not SMR-aware, so it doesn't help. I
saw posts that some XFS work is on its way, but it's not being actively
developed.

What I did see, however, is that you need to pass some flags to mkfs. Did
you need to do that when formatting btrfs on the SMR disks?

(Rough sketches of the journal-disk setup and of the benchmark commands
are at the bottom of this mail, below the quoted thread.)

Wido

> If you don't like btrfs, you could try to use a journal disk for XFS
> itself and also a journal disk for Ceph. I assume this will also solve
> many problems, as the XFS journal is rewritten often and SMR disks
> don't like rewrites.
> I think that is one reason why btrfs works more smoothly with those
> disks.
>
> Hope this helps
>
> Bernhard
>
> Wido den Hollander <wido@xxxxxxxx> wrote on Mon, 13 Feb 2017 at 16:11:
> >
> > On 13 February 2017 at 15:57, Peter Maloney
> > <peter.maloney@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Then you're not aware of what SMR disks do. They are just slow for
> > > all writes: because of the overlapping tracks they have to read the
> > > surrounding tracks and then write everything back again, instead of
> > > just the one thing you really wanted to write. To partially
> > > mitigate this they have some tiny write buffer, something like 8GB
> > > of flash, which gives them "normal" speed until it is full, and
> > > then they crawl (at least this is what the Seagate ones do).
> > > Journals aren't designed to solve that... they help take the sync
> > > load off the OSD, but they don't somehow make the throughput higher
> > > (at least not sustained). Even if the journal were perfectly
> > > designed for performance, it would still do absolutely nothing if
> > > it's full and the disk is still busy flushing the old data.
> > >
> >
> > Well, that explains it indeed. I wasn't aware of the additional
> > buffer inside an SMR disk.
> >
> > I was asked to look at this system for somebody who bought SMR disks
> > without knowing. As I never touch these disks I found the behavior
> > odd.
> >
> > The buffer explains it a lot better; I wasn't aware that SMR disks
> > have that.
> >
> > SMR shouldn't be used in Ceph without proper support in BlueStore or
> > an SMR-aware XFS.
> >
> > Wido
> >
> > > On 02/13/17 15:49, Wido den Hollander wrote:
> > > > Hi,
> > > >
> > > > I have an odd case with SMR disks in a Ceph cluster. Before I
> > > > continue: yes, I am fully aware that SMR and Ceph don't play
> > > > along well, but something is happening here that I'm not able to
> > > > fully explain.
> > > >
> > > > On a 2x replica cluster with 8TB Seagate SMR disks I can write at
> > > > about 30MB/sec to each disk using a simple RADOS bench:
> > > >
> > > > $ rados bench -t 1
> > > > $ time rados put 1GB.bin
> > > >
> > > > Both ways I found out that the disk can write at that rate.
> > > >
> > > > Now, when I start a benchmark with 32 threads it writes fine. Not
> > > > super fast, but it works.
> > > >
> > > > After 15 minutes or so various disks go to 100% busy and just
> > > > stay there. These OSDs are being marked down and some even commit
> > > > suicide due to threads timing out.
> > > >
> > > > Stopping the RADOS bench and starting the OSDs again resolves the
> > > > situation.
> > > >
> > > > I am trying to explain what's happening.
> > > > I'm aware that SMR isn't very good at random writes. To
> > > > partially overcome this there are Intel DC S3510s in there as
> > > > journal SSDs.
> > > >
> > > > Can anybody explain why this 100% busy pops up after 15 minutes
> > > > or so?
> > > >
> > > > Obviously it would be best if BlueStore had SMR support, but for
> > > > now it's just FileStore with XFS on there.
> > > >
> > > > Wido
>
> --
> Kind regards
>
> Bernhard J. M. Grün, Püttlingen, Germany
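To make sure I understand the suggestion: below is a rough sketch of what
"a journal disk for XFS itself and also a journal disk for Ceph" might look
like. The device names are only placeholders (/dev/sdc for the SMR data
disk, /dev/sdb for an SSD with one partition for the XFS log and one for
the FileStore journal), and I have not verified any of this on SMR
hardware:

# XFS on the SMR disk, with its metadata log on an SSD partition
$ mkfs.xfs -f -l logdev=/dev/sdb1,size=64m /dev/sdc
$ mount -o rw,noatime,logdev=/dev/sdb1 /dev/sdc /var/lib/ceph/osd/ceph-0

With ceph-disk the same idea would presumably be expressed by putting those
options in ceph.conf ("osd mkfs options xfs" / "osd mount options xfs") and
handing the SSD to ceph-disk as the journal device, along the lines of
"ceph-disk prepare --fs-type xfs /dev/sdc /dev/sdb" -- though the logdev
partition would then have to be arranged per OSD, so treat this as an
illustration rather than a recipe.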
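And for reference, the benchmark that triggers the problem is basically the
following; the pool name, runtimes and object name are placeholders:

$ rados bench -p <pool> 60 write -t 1
$ time rados -p <pool> put 1GB.bin 1GB.bin
$ rados bench -p <pool> 1200 write -t 32

While the 32-thread run is going, the easiest way to see the 100% busy
state is per-disk utilisation on the OSD hosts, e.g.:

$ iostat -x 5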