2016-02-18 15:10 GMT+01:00 Dan van der Ster <dan@xxxxxxxxxxxxxx>:
> Hi,
>
> Thanks for linking to a current update on this problem [1] [2]. I
> really hope that new Ceph installations aren't still following that
> old advice... it's been known to be a problem for around a year and a
> half [3].
> That said, the "-n size=64k" wisdom was really prevalent a few years
> ago, and I wonder how many old clusters are at risk today.

Thanks for listing some more references; strangely, I couldn't find
these when I looked into this issue a couple of weeks ago. Word of this
also seems to be spreading only slowly, even inside the Ceph developer
community: cbt, for example, still used this setting until recently [5],
which, together with some other references I found, made me use it as a
default until last week.

> I manage a large enough number of affected OSDs that I'll be willing
> to try all other possibilities before reformatting them [4]. Today
> they're rock solid stable on EL6 (with hammer), but the jewel release
> is getting closer and that's when we'll need to upgrade to EL7. (I've
> already upgraded one host to 7 and haven't seen any problems yet, but
> that one sample doesn't offer much comfort for the rest.) Anyway, it's
> great to hear that there's a patch in the works... Dave deserves
> infinite thanks if this gets resolved.

Since I'm not a subscriber, [4] wasn't that useful to me, but from your
comments I take it that the recommended solution there is also to
reformat.

According to my tests so far, even when the new patch eventually makes
it into the kernel, it will only reduce the impact of the issue, not
resolve it completely. Memory may still become fragmented enough for the
required allocations to fail for some time, so although operations will
no longer stall completely, there will likely still be a performance
impact. In the end, reformatting may still be the safest solution.

> [1] http://tracker.ceph.com/issues/6301
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1278992
> [3] https://github.com/redhat-cip/puppet-ceph/commit/b9407efd4a8a25d452e493fb48ea048e4d36e070
> [4] https://access.redhat.com/solutions/1597523

[5] https://github.com/ceph/cbt/pull/85
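
P.S. In case it helps anyone auditing older clusters: below is a
minimal, untested sketch for spotting OSD filesystems that were
formatted with "-n size=64k". It assumes OSD data directories are
mounted under /var/lib/ceph/osd/ (adjust OSD_ROOT for other layouts)
and simply parses the "naming" line of xfs_info, which reports the
directory block size; an unaffected filesystem shows bsize=4096, while
one formatted with "-n size=64k" shows bsize=65536.

#!/usr/bin/env python
# Sketch: flag XFS filesystems formatted with a large directory block
# size ("-n size=64k"). OSD_ROOT is an assumption about the mount layout.
import glob
import re
import subprocess

OSD_ROOT = "/var/lib/ceph/osd"

for mount in sorted(glob.glob(OSD_ROOT + "/*")):
    try:
        out = subprocess.check_output(["xfs_info", mount]).decode()
    except (OSError, subprocess.CalledProcessError):
        continue  # xfs_info missing, mount point absent, or not XFS
    # xfs_info prints e.g.: "naming   =version 2   bsize=65536  ascii-ci=0"
    m = re.search(r"^naming\s*=.*?bsize=(\d+)", out, re.MULTILINE)
    if m and int(m.group(1)) > 4096:
        print("%s: directory bsize=%s (formatted with -n size=64k?)"
              % (mount, m.group(1)))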