On Fri, Apr 9, 2021 at 8:39 PM Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
>
> On Fri, Apr 9, 2021 at 11:49 AM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> >
> > Thanks. I didn't see anything ultra obvious to me.
> >
> > But I did notice the nearfull warnings so I wonder if this cluster is
> > churning through osdmaps? Did you see a large increase in inbound or
> > outbound network traffic on this mon following the upgrade?
> > Totally speculating here, but maybe there is an issue where you have
> > some old clients, which can't decode an incremental osdmap from a
> > nautilus mon, so the single mon is busy serving up these maps to the
> > clients.
> >
> > Does the mon load decrease if you stop the osdmap churn?, e.g. by
> > setting norebalance if that is indeed ongoing.
> >
> > Could you also share debug_ms = 1 for a minute of busy cpu mon?
>
> Here are the new logs with the debug_ms=1 for a bit.
> https://owncloud.leblancnet.us/owncloud/index.php/s/1hvtJo3s2oLPpWn

One strange thing in these logs: there is a single hammer client that is
asking for over 800,000 incremental osdmaps, seemingly every 30s.

client.131831153 at 172.16.212.55 is asking for incrementals from
1170448..1987355 (see [1]).

Can you try to evict/kill/block that client and see if your mon load drops?

-- dan

[1]

-43> 2021-04-09 13:12:37.032 7f50de246700 5 mon.sun-storemon01@0(leader).osd e1987341 send_incremental [1170448..1987341] to client.131831153

2021-04-09 17:07:27.238 7f9fc83e4700 10 mon.sun-storemon01@0(leader) e45 handle_subscribe mon_subscribe({mdsmap=3914079+,monmap=0+,osdmap=1170448})
2021-04-09 17:07:27.238 7f9fc83e4700 10 mon.sun-storemon01@0(leader).osd e1987355 check_osdmap_sub 0x55e2e2133de0 next 1170448 (onetime)
2021-04-09 17:07:27.238 7f9fc83e4700 5 mon.sun-storemon01@0(leader).osd e1987355 send_incremental [1170448..1987355] to client.131831153

2021-04-09 17:07:50.910 7f9fc83e4700 5 mon.sun-storemon01@0(leader) e45 dispatch_op client.131831153 v1:172.16.212.55:0/527701465 is not authenticated, dropping mon_subscribe({mdsmap=3914079+,monmap=0+,osdmap=1170448})

2021-04-09 18:14:47.295 7f9fc83e4700 1 -- [v2:10.65.7.203:3300/0,v1:10.65.7.203:6789/0] <== client.131831153 v1:172.16.212.55:0/527701465 3 ==== mon_subscribe({mdsmap=3914127+,monmap=0+,osdmap=1170448}) ==== 85+0+0 (unknown 1413914345 0 0) 0x55e2dbc52c00 con 0x55e2e1cf5680
2021-04-09 18:15:17.006 7f9fc83e4700 1 -- [v2:10.65.7.203:3300/0,v1:10.65.7.203:6789/0] <== client.131831153 v1:172.16.212.55:0/527701465 2 ==== mon_subscribe({mdsmap=3914127+,monmap=0+,osdmap=1170448}) ==== 85+0+0 (unknown 1413914345 0 0) 0x55e2da565200 con 0x55e2df00a880
2021-04-09 18:15:17.278 7f9fc83e4700 1 -- [v2:10.65.7.203:3300/0,v1:10.65.7.203:6789/0] <== client.131831153 v1:172.16.212.55:0/527701465 3 ==== mon_subscribe({mdsmap=3914127+,monmap=0+,osdmap=1170448}) ==== 85+0+0 (unknown 1413914345 0 0) 0x55e2de443000 con 0x55e2ee3d8400
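
To check whether other clients are doing the same thing, something like the rough Python sketch below could be run over the mon log. It only assumes the `send_incremental [first..last] to client.N` line format seen in [1]; the function name `summarize_incrementals` and the `sample` list are made up for illustration and are not part of any Ceph tooling.

```python
import re
from collections import defaultdict

# Matches the mon log fragment seen in [1], e.g.
#   send_incremental [1170448..1987355] to client.131831153
SEND_INC = re.compile(r"send_incremental \[(\d+)\.\.(\d+)\] to (client\.\d+)")


def summarize_incrementals(lines):
    """Return {client: (request_count, largest_epoch_span)} for mon log lines.

    Hypothetical helper: the span is the number of osdmap epochs covered by
    one send_incremental range, counting both endpoints.
    """
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        match = SEND_INC.search(line)
        if match is None:
            continue  # not a send_incremental line
        first, last = int(match.group(1)), int(match.group(2))
        client = match.group(3)
        entry = stats[client]
        entry[0] += 1
        entry[1] = max(entry[1], last - first + 1)
    return {client: (count, span) for client, (count, span) in stats.items()}


# Two of the lines from [1], used as sample input.
sample = [
    "2021-04-09 13:12:37.032 7f50de246700 5 mon.sun-storemon01@0(leader).osd "
    "e1987341 send_incremental [1170448..1987341] to client.131831153",
    "2021-04-09 17:07:27.238 7f9fc83e4700 5 mon.sun-storemon01@0(leader).osd "
    "e1987355 send_incremental [1170448..1987355] to client.131831153",
]
result = summarize_incrementals(sample)
for client, (count, span) in sorted(result.items()):
    print(f"{client}: {count} requests, largest span {span} epochs")
```

Any client whose largest span is in the hundreds of thousands is a candidate for the same treatment as client.131831153. To use it on a real log, pass an open file object instead of `sample`.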