Hello fellow Cephers, My 12.2.2 cluster is pretty full so I've been adding new nodes/OSDs. Last week I added two new nodes with 12 OSDs each and they are still backfilling. I have max_backfills tuned quite low across the board to minimize client impact. Yesterday I brought two more nodes online each with 12 OSDs and added them to the crushmap under a staging root, planning to add those to root=default when the two from last week complete backfilling. When the OSDs processes came up they all did what I describe below and since it only takes two OSDs on different hosts... the mons started marking existing OSDs down. So I backed that out and am now just working with a single OSD on of the new nodes until I can figure this out. When the OSD process starts up it's listening on ports 6800 and 6801 on both the cluster and public interfaces. It successfully gets the current osdmap from a monitor and chooses 10 OSDs to peer with, all of which fail. It doesn't appear to be a basic networking issue; I turned up debug osd and ms to 20 and based on the following it looks like a successful ping/reply with the OSD peer (osd.0), but after a while the log says it's never heard from this OSD. 2019-03-14 14:17:42.350902 7fe698776700 10 osd.403 103451 _add_heartbeat_peer: new peer osd.0 10.8.78.23:6814/8498484 10.8.76.23:6805/8498484 2019-03-14 14:17:44.165460 7fe68df61700 1 -- 10.8.76.48:0/67279 --> 10.8.76.23:6805/8498484 -- osd_ping(ping e103451 stamp 2019-03-14 14:17:44.165415) v4 -- 0x55844222aa00 con 0 2019-03-14 14:17:44.165467 7fe68df61700 20 -- 10.8.76.48:0/67279 >> 10.8.76.23:6805/8498484 conn(0x558442368000 :-1 s=STATE_OPEN pgs=2349 cs=1 l=1).prepare_send_message m osd_ping(ping e103451 stamp 2019-03-14 14:17:44.165415) v4 2019-03-14 14:17:44.165471 7fe68df61700 20 -- 10.8.76.48:0/67279 >> 10.8.76.23:6805/8498484 conn(0x558442368000 :-1 s=STATE_OPEN pgs=2349 cs=1 l=1).prepare_send_message encoding features 2305244844532236283 0x55844222aa00 osd_ping(ping e103451 stamp 2019-03-14 14:17:44.165415) v4 2019-03-14 14:17:44.165691 7fe6a574e700 5 -- 10.8.76.48:0/67279 >> 10.8.76.23:6805/8498484 conn(0x558442368000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2349 cs=1 l=1). rx osd.0 seq 1 0x55844206ba00 osd_ping(ping_reply e103451 stamp 2019-03-14 14:17:44.165415) v4 2019-03-14 14:17:44.165697 7fe6a574e700 1 -- 10.8.76.48:0/67279 <== osd.0 10.8.76.23:6805/8498484 1 ==== osd_ping(ping_reply e103451 stamp 2019-03-14 14:17:44.165415) v4 ==== 2004+0+0 (4204681659 0 0) 0x55844206ba00 con 0x558442368000 ... seq 2-6... 2019-03-14 14:17:57.468338 7fe68df61700 1 -- 10.8.76.48:0/67279 --> 10.8.76.23:6805/8498484 -- osd_ping(ping e103451 stamp 2019-03-14 14:17:57.468301) v4 -- 0x5584422e2c00 con 0 2019-03-14 14:17:57.468343 7fe68df61700 20 -- 10.8.76.48:0/67279 >> 10.8.76.23:6805/8498484 conn(0x558442368000 :-1 s=STATE_OPEN pgs=2349 cs=1 l=1).prepare_send_message m osd_ping(ping e103451 stamp 2019-03-14 14:17:57.468301) v4 2019-03-14 14:17:57.468348 7fe68df61700 20 -- 10.8.76.48:0/67279 >> 10.8.76.23:6805/8498484 conn(0x558442368000 :-1 s=STATE_OPEN pgs=2349 cs=1 l=1).prepare_send_message encoding features 2305244844532236283 0x5584422e2c00 osd_ping(ping e103451 stamp 2019-03-14 14:17:57.468301) v4 2019-03-14 14:17:57.468554 7fe6a574e700 5 -- 10.8.76.48:0/67279 >> 10.8.76.23:6805/8498484 conn(0x558442368000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2349 cs=1 l=1). rx osd.0 seq 6 0x55844222a600 osd_ping(ping_reply e103451 stamp 2019-03-14 14:17:57.468301) v4 2019-03-14 14:17:57.468561 7fe6a574e700 1 -- 10.8.76.48:0/67279 <== osd.0 10.8.76.23:6805/8498484 6 ==== osd_ping(ping_reply e103451 stamp 2019-03-14 14:17:57.468301) v4 ==== 2004+0+0 (306125004 0 0) 0x55844222a600 con 0x558442368000 2019-03-14 14:18:04.266809 7fe6a1f89700 -1 osd.403 103451 heartbeat_check: no reply from 10.8.76.23:6805 osd.0 ever on either front or back, first ping sent 2019-03-14 14:17:44.165415 (cutoff 2019-03-14 14:17:44.266808) 2019-03-14 14:18:05.267163 7fe6a1f89700 -1 osd.403 103451 heartbeat_check: no reply from 10.8.76.23:6805 osd.0 ever on either front or back, first ping sent 2019-03-14 14:17:44.165415 (cutoff 2019-03-14 14:17:45.267163) 2019-03-14 14:18:06.267296 7fe6a1f89700 -1 osd.403 103451 heartbeat_check: no reply from 10.8.76.23:6805 osd.0 ever on either front or back, first ping sent 2019-03-14 14:17:44.165415 (cutoff 2019-03-14 14:17:46.267295) This whole time other OSDs are marking 403 down so after the grace period expires the monitor takes it down. cluster [INF] osd.403 failed (root=staging,host=chhq-supcphsn34) (2 reporters from different host after 25.757343 >= grace 21.560865) After about 45 seconds the OSD then stops listening on the aforementioned ports and binds to different ones. Is this normal? There's no obvious indication in the OSD log of why or when it does this. # date;ss -tlnp|grep 67279 Thu Mar 14 14:18:27 CDT 2019 LISTEN 0 128 10.8.78.40:6800 *:* users:(("ceph-osd",pid=67279,fd=14)) LISTEN 0 128 10.8.76.48:6800 *:* users:(("ceph-osd",pid=67279,fd=13)) LISTEN 0 128 10.8.76.48:6801 *:* users:(("ceph-osd",pid=67279,fd=16)) LISTEN 0 128 10.8.78.40:6801 *:* users:(("ceph-osd",pid=67279,fd=15)) # date;ss -tlnp|grep 67279 Thu Mar 14 14:18:28 CDT 2019 LISTEN 0 128 10.8.76.48:6800 *:* users:(("ceph-osd",pid=67279,fd=13)) LISTEN 0 128 10.8.76.48:6802 *:* users:(("ceph-osd",pid=67279,fd=16)) LISTEN 0 128 10.8.78.40:6802 *:* users:(("ceph-osd",pid=67279,fd=14)) LISTEN 0 128 10.8.78.40:6803 *:* users:(("ceph-osd",pid=67279,fd=15)) But the current osdmap still shows those old ports (probably because all communication from osd to mon has ceased until the next reportable event?) 103451 osd.403 up in weight 1 up_from 103451 up_thru 0 down_at 103437 last_clean_interval [103436,103442) 10.8.76.48:6800/67279 10.8.78.40:6800/67279 10.8.78.40:6801/67279 10.8.76.48:6801/67279 exists,up f664f770-2b92-43c4-818c-db8b482515c4 103456 osd.403 down in weight 1 up_from 103451 up_thru 0 down_at 103452 last_clean_interval [103436,103442) 10.8.76.48:6800/67279 10.8.78.40:6800/67279 10.8.78.40:6801/67279 10.8.76.48:6801/67279 exists f664f770-2b92-43c4-818c-db8b482515c4 At this point it doesn't appear I can add any OSDs to my cluster. No existing OSDs have been flapping so I don't think network congestion or heartbeat traffic in general is an issue. I'm hoping I've overlooked something simple. Any help is appreciated! Thanks, Josh _______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com