David - I'm new to Ceph myself, so I can't point out any smoking guns, but your problem "feels" like a network issue.

I suggest you check all of your OSD/Mon/client network interfaces. Check for errors, and check that they are negotiating the same link speed/type with your switches (if you have LLDP enabled on your switches, this will help). Most importantly, check that your MTUs match - i.e., if you are using jumbo frames (e.g. 9000 MTU) on your hosts, make sure your switches *also* support that, with appropriate allowance for packet overhead (e.g. 9128). If you have your hosts set to 9000 and your switches to 1500, you'll see this exact behavior... I've put a few concrete command sketches for these checks at the bottom of this message, below the quoted thread.

Hopefully that helps some ...

~~shane

On 7/17/15, 8:57 AM, "ceph-users on behalf of J David" <ceph-users-bounces@xxxxxxxxxxxxxx on behalf of j.david.lists@xxxxxxxxx> wrote:

>On Fri, Jul 17, 2015 at 11:15 AM, Quentin Hartman
><qhartman@xxxxxxxxxxxxxxxxxxx> wrote:
>> That looks a lot like what I was seeing initially. The OSDs getting
>> marked out was relatively rare and it took a bit before I saw it.
>
>Our problem is "most of the time" and does not appear confined to a
>specific Ceph cluster node or OSD:
>
>$ sudo fgrep 'waiting for subops' ceph.log | sed -e 's/.* v4 //' |
>sort | uniq -c | sort -n
>      1 currently waiting for subops from 0
>      1 currently waiting for subops from 10
>      1 currently waiting for subops from 11
>      1 currently waiting for subops from 12
>      1 currently waiting for subops from 3
>      1 currently waiting for subops from 7
>      2 currently waiting for subops from 13
>      2 currently waiting for subops from 16
>      2 currently waiting for subops from 4
>      3 currently waiting for subops from 15
>      4 currently waiting for subops from 6
>      4 currently waiting for subops from 8
>      7 currently waiting for subops from 2
>
>Node f16: OSDs 0, 2, and 3 (3 out of 4)
>Node f17: OSDs 4, 6, 7, 8, 10, 11, 12, 13, and 15 (9 out of 12)
>Node f18: OSD 16 (1 out of 12)
>
>So f18 seems like the odd man out, in that it has *fewer* problems
>than the other two.
>
>There is a grand total of 2 RX errors across all the interfaces on
>all three machines. (Each one has dual 10G interfaces bonded together
>as active/failover.)
>
>The OSD log for the worst offender above (osd.2) says:
>
>2015-07-17 08:52:05.441607 7f562ea0c700  0 log [WRN] : 1 slow
>requests, 1 included below; oldest blocked for > 30.119568 secs
>
>2015-07-17 08:52:05.441622 7f562ea0c700  0 log [WRN] : slow request
>30.119568 seconds old, received at 2015-07-17 08:51:35.321991:
>osd_sub_op(client.32913524.0:3149584 2.249
>2792c249/rbd_data.15322ae8944a.000000000011b487/head//2 [] v
>10705'944603 snapset=0=[]:[] snapc=0=[]) v11 currently started
>
>2015-07-17 08:52:43.229770 7f560833f700  0 --
>192.168.2.216:6813/16029552 >> 192.168.2.218:6810/7028653
>pipe(0x25265180 sd=25 :6813 s=2 pgs=23894 cs=41 l=0
>c=0x22be4c60).fault with nothing to send, going to standby
>
>There are a bunch of those "fault with nothing to send, going to
>standby" messages.
>
>> The messages were like "So-and-so incorrectly marked us
>> out" IIRC.
>
>Nothing like that. Nor, with "ceph -w" running constantly, is there
>any reference to anything being marked out at any point, even when
>problems are severe.
>
>Thanks!
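
P.S. The concrete checks I mentioned above. These are just sketches,
not gospel: I'm assuming Linux hosts with ethtool installed, and
"eth0" below is a placeholder for whatever your bond slaves are
actually called. First, per-interface negotiated speed/duplex and
link state, plus the NIC's own low-level counters (ethtool -S often
shows drops and CRC errors that the kernel-level counters miss):

$ sudo ethtool eth0 | egrep 'Speed|Duplex|Link detected'
$ sudo ethtool -S eth0 | egrep -i 'err|drop|crc' | grep -v ': 0$'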
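
Next, the configured MTU on every interface - these need to match
across hosts, and the switches need to carry at least as much:

$ ip link | awk '/^[0-9]+:/ {print $2, $4, $5}'

To prove jumbo frames actually survive the host -> switch -> host
path, ping a peer with the don't-fragment bit set and a payload sized
for a 9000-byte MTU (9000 minus 20 bytes of IP header and 8 bytes of
ICMP header leaves 8972). I've used the 192.168.2.218 address from
your OSD log as the example peer; substitute whichever cluster-network
address you like. If the switch path is stuck at 1500, this fails
immediately while a plain ping works fine:

$ ping -M do -s 8972 -c 3 192.168.2.218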
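
Since you're on active/backup bonds, it's also worth confirming the
bond-level counters and that the bonds aren't quietly flapping between
slaves - a nonzero link failure count would fit an intermittent,
"most of the time" problem. "bond0" here is a guess at your bond
device name:

$ ip -s link show bond0
$ egrep 'Currently Active Slave|Link Failure Count' /proc/net/bonding/bond0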
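
Finally, on the Ceph side, a quick way to rule disks in or out: ceph
osd perf reports per-OSD commit/apply latency, so if osd.2's subop
waits were disk-bound rather than network-bound, it should stand out
there as well:

$ ceph osd perf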