On Sun, 14 Jul 2013, Stefan Priebe - Profihost AG wrote:
> Hi sage,
>
> On 14.07.2013 at 17:01, Sage Weil <sage@xxxxxxxxxxx> wrote:
>
> > On Sun, 14 Jul 2013, Stefan Priebe wrote:
> >> Hello list,
> >>
> >> might this be a problem due to having too many PGs? I've 370 per OSD
> >> instead of 33 / OSD (OSDs*100/3).
> >
> > That might exacerbate it.
> >
> > Can you try setting
> >
> >  osd min pg log entries = 50
> >  osd max pg log entries = 100
>
> What exactly does that do? And why is a restart of all osds needed? Thanks!

This limits the size of the pg log.

> > across your cluster, restarting your osds, and see if that makes a
> > difference? I'm wondering if this is a problem with pg log rewrites after
> > peering. Note that adding that option and restarting isn't enough to
> > trigger the trim; you have to hit the cluster with some IO too, and (if
> > this is the source of your problem) the trim itself might be expensive.
> > So add it, restart, do a bunch of io (to all pools/pgs if you can), and
> > then see if the problem is still present?
>
> Will try, but I can't produce a write to every pg. It's a prod cluster
> with KVM rbd. It does have 800-1200 iops, though.

Hmm, if this is a production cluster, I would be careful, then! Setting
the pg logs too short can lead to backfill, which is very expensive (as
you know). The defaults are 3000 / 10000, so maybe try something less
aggressive like changing min to 500?

Also, I think

 ceph osd tell \* injectargs '--osd-min-pg-log-entries 500'

should work as well. But again, be aware that lowering the value will
incur a trim that may in itself be a bit expensive (if this is the
source of the problem).

It is probably worth watching

 ceph pg dump | grep $some_random_pg

and watching the 'v' column over time (say, a minute or two) to see how
quickly pg events are being generated on your cluster. This will give
you a sense of how much time 500 (or however many) pg log entries
covers!

sage

> > Also note that the lower osd min pg log entries means that the osd cannot
> > be down as long without requiring a backfill (50 ios per pg). These
> > probably aren't the values that we want, but I'd like to find out whether
> > the pg log rewrites after peering in cuttlefish are the culprit here.
> >
> > Thanks!
> >
> >> Is there any plan for PG merging?
> >
> > Not right now. :( I'll talk to Sam, though, to see how difficult it
> > would be given the split approach we settled on.
> >
> > Thanks!
> > sage
> >
> >> Stefan
> >>> Hello list,
> >>>
> >>> anyone else here who always has problems bringing back an offline OSD?
> >>> Since cuttlefish I'm seeing slow requests for the first 2-5 minutes
> >>> after bringing an OSD online again, but that's so long that the VMs
> >>> crash as they think their disk is offline...
> >>>
> >>> Under bobtail I never had any problems with that.
> >>>
> >>> Please HELP!
> >>>
> >>> Greets,
> >>> Stefan

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
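
For anyone following along later: the two routes Sage describes could
look like the sketch below. This is only a sketch using the
less-aggressive 500 minimum floated above; the [osd] section placement
follows the usual ceph.conf layout, and osd max pg log entries is left
at its 10000 default.

 # ceph.conf, on every osd host -- takes effect after an osd restart,
 # and the trim itself only fires once some io hits the pgs
 [osd]
     osd min pg log entries = 500

 # or injected at runtime into all osds, no restart needed (as Sage
 # notes above):
 ceph osd tell \* injectargs '--osd-min-pg-log-entries 500'

Either way, remember Sage's caveat: the trim only happens on io, and a
min that is too low shortens how long an osd can be down before it
needs a full backfill.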
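
And a minimal sketch of the 'v'-column check Sage suggests, sampling
one pg twice and comparing by hand (awk with an exact first-field match
is used here instead of grep to avoid matching similar pg ids). The pg
id 2.3f and the 120-second window are placeholders; pick any pg id from
your own 'ceph pg dump' output.

 PG=2.3f                                    # placeholder: any pg id
 ceph pg dump | awk -v pg="$PG" '$1 == pg'  # note the 'v' column
 sleep 120
 ceph pg dump | awk -v pg="$PG" '$1 == pg'  # compare 'v' again; the
                                            # change over 120s is roughly
                                            # how fast log entries accrue
                                            # for this pg

Dividing your chosen osd min pg log entries by that per-pg rate tells
you how much wall-clock time the log covers, i.e. how long an osd can
be down before backfill kicks in.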