On Thu, Feb 18, 2016 at 3:46 PM, Jens Rosenboom <j.rosenboom@xxxxxxxx> wrote:
> 2016-02-18 15:10 GMT+01:00 Dan van der Ster <dan@xxxxxxxxxxxxxx>:
>> Hi,
>>
>> Thanks for linking to a current update on this problem [1] [2]. I
>> really hope that new Ceph installations aren't still following that
>> old advice... it's been known to be a problem for around a year and
>> a half [3]. That said, the "-n size=64k" wisdom was really prevalent
>> a few years ago, and I wonder how many old clusters are at risk
>> today.
>
> Thanks for listing some more references; it's strange that I couldn't
> find these when I was looking into this issue a couple of weeks ago.
>
> The wisdom also seems to be spreading only slowly, even inside the
> Ceph developer community: cbt, for example, still used this setting
> until recently [5], which, together with some other references I
> found, led me to use it as a default until last week.

Whoa, good catch!

>> I manage a sufficiently large number of affected OSDs that I'll be
>> willing to try all other possibilities before reformatting them [4].
>> Today they're rock-solid stable on EL6 (with hammer), but the jewel
>> release is getting closer, and that's when we'll need to upgrade to
>> EL7. (I've already upgraded one host to 7 and haven't seen any
>> problems yet, but that one sample doesn't offer much comfort for the
>> rest.) Anyway, it's great to hear that there's a patch in the
>> works... Dave deserves infinite thanks if this gets resolved.
>
> Since I'm not a subscriber, [4] wasn't that useful to me, but from
> your comments I take it that the recommended solution is also to
> reformat.

Subscriber or not, everything I've read up to now suggests that
reformatting is the only solution. There was a partial fix mentioned
in [1], which I've confirmed is present in the EL7 kernels. But since
I'm not able to reproduce the problem, I wasn't sure whether that
patch fixed it, or... Anyway, your thread shows we're all still at
risk. Thanks!

> According to my tests so far, even when the new patch eventually
> makes it into the kernel, it will only reduce the impact of the
> issue, not resolve it completely. Memory may still become fragmented
> enough for the needed allocations to fail for some time, so although
> operations will not stall completely, there will likely still be a
> performance impact. In the end, reformatting may still be the safest
> solution.

BTW, we run our OSD servers with vm.min_free_kbytes=1048576 -- this is
some other old wisdom, intended to make memory fragmentation less
likely. I have no idea whether it's still good advice, but maybe...
try it?

-- Dan

>> [1] http://tracker.ceph.com/issues/6301
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1278992
>> [3] https://github.com/redhat-cip/puppet-ceph/commit/b9407efd4a8a25d452e493fb48ea048e4d36e070
>> [4] https://access.redhat.com/solutions/1597523
>
> [5] https://github.com/ceph/cbt/pull/85
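
For anyone auditing an existing cluster for the "-n size=64k" problem
discussed above: the directory block size is visible in the "naming"
line of xfs_info output, so affected OSDs can be identified without
reformatting anything. A minimal sketch -- the device and mount point
below are placeholders, your OSD paths will differ:

    # The problematic format from the old advice: 64 KiB directory blocks
    mkfs.xfs -n size=64k /dev/sdb1

    # Check an existing OSD: bsize=65536 in the "naming" line means the
    # filesystem is affected; the XFS default of bsize=4096 means it is not
    xfs_info /var/lib/ceph/osd/ceph-0 | grep naming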
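
Likewise, the vm.min_free_kbytes tweak Dan mentions can be tried at
runtime before being made persistent. A sketch, using the same 1 GiB
value from his mail:

    # Inspect the current free-memory reserve
    sysctl vm.min_free_kbytes

    # Raise the kernel's free-memory watermark to ~1 GiB at runtime
    sysctl -w vm.min_free_kbytes=1048576

    # Persist the setting across reboots
    echo 'vm.min_free_kbytes = 1048576' >> /etc/sysctl.conf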