Hi everyone,

For a new build we first tested the 5.4 kernel, which didn't work well for us, and ultimately moved to Ubuntu 20.04.3 HWE with the 5.11 kernel. We can now get all OSDs more or less up, but after a clean OS reinstall we are seeing the behavior below, which is causing slow ops even before any pool or filesystem has been created. We are using LACP bonds with MTU 9000 for both the front and back networks; both networks are 100G. I've also tried increasing the OSD bind port range from its default upper bound of 7300. Any ideas? (The checks I've been running on the networks, the bind ports, and the slow ops themselves are listed at the end of this mail, after the log excerpt.)

Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.248+0000 7f5363c2c080 0 _get_class not permitted to load lua
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.248+0000 7f5363c2c080 0 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic>
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.248+0000 7f5363c2c080 0 _get_class not permitted to load sdk
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.248+0000 7f5363c2c080 0 _get_class not permitted to load kvs
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.248+0000 7f5363c2c080 0 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic>
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.248+0000 7f5363c2c080 0 osd.74 976 crush map has features 288514051259236352, adjusting msgr requires for clients
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.248+0000 7f5363c2c080 0 osd.74 976 crush map has features 288514051259236352 was 8705, adjusting msgr requires for mons
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: 2021-10-18T16:17:45.248+0000 7f5363c2c080 0 osd.74 976 crush map has features 3314933000852226048, adjusting msgr requires for osds
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.248+0000 7f5363c2c080 1 osd.74 976 check_osdmap_features require_osd_release unknown -> pacific
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.264+0000 7f5363c2c080 0 osd.74 976 load_pgs
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.264+0000 7f5363c2c080 0 osd.74 976 load_pgs opened 1 pgs
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.268+0000 7f5363c2c080 -1 osd.74 976 log_to_monitors {default=true}
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.688+0000 7f5363c2c080 0 osd.74 976 done with init, starting boot process
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: 2021-10-18T16:17:45.688+0000 7f5363c2c080 1 osd.74 976 start_boot
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.688+0000 7f53412e0700 1 osd.74 pg_epoch: 976 pg[1.0( empty local-lis/les=0/0 n=0 ec=149/149 lis/c=0/0 les/c/f=0/0/0 sis=975) [74,0] r=0 lpr=975 pi=[149,975)/7 crt=0'0 m>
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.692+0000 7f5356b50700 -1 osd.74 976 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
Oct 18 16:17:45 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:45.692+0000 7f5356b50700 1 osd.74 976 set_numa_affinity not setting numa affinity
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.288+0000 7f5358353700 1 osd.74 976 tick checking mon for new map
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.300+0000 7f53412e0700 1 osd.74 pg_epoch: 977 pg[1.0( empty local-lis/les=0/0 n=0 ec=149/149 lis/c=0/0 les/c/f=0/0/0 sis=975) [74,0] r=0 lpr=975 pi=[149,975)/7 crt=0'0 m>
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.300+0000 7f53412e0700 1 osd.74 pg_epoch: 977 pg[1.0( empty local-lis/les=0/0 n=0 ec=149/149 lis/c=0/0 les/c/f=0/0/0 sis=977) [0] r=-1 lpr=977 pi=[149,977)/7 crt=0'0 mlc>
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.300+0000 7f53412e0700 1 osd.74 pg_epoch: 979 pg[1.0( empty local-lis/les=0/0 n=0 ec=149/149 lis/c=0/0 les/c/f=0/0/0 sis=977) [0] r=-1 lpr=977 pi=[149,977)/7 crt=0'0 mlc>
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.724+0000 7f534e130700 1 osd.74 980 state: booting -> active
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.724+0000 7f53412e0700 1 osd.74 pg_epoch: 980 pg[1.0( empty local-lis/les=0/0 n=0 ec=149/149 lis/c=0/0 les/c/f=0/0/0 sis=980) [74,0] r=0 lpr=980 pi=[149,980)/7 crt=0'0 m>
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.724+0000 7f53412e0700 1 osd.74 pg_epoch: 980 pg[1.0( empty local-lis/les=0/0 n=0 ec=149/149 lis/c=0/0 les/c/f=0/0/0 sis=980) [74,0] r=0 lpr=980 pi=[149,980)/7 crt=0'0 m>
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.724+0000 7f535cbd9700 -1 --2- <CLUSTER_IP_NODE1>:0/3582903554 >> [v2:<CLUSTER_IP_NODE1>:7275/2411802373,v1:<CLUSTER_IP_NODE1>:7279/2411802373] conn(0x55e9db0dd800 0x55e9db1c4000 unknown :-1 >
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.724+0000 7f535d3da700 -1 --2- <CLUSTER_IP_NODE1>:0/3582903554 >> [v2:<CLUSTER_IP_NODE1>:6917/4091393816,v1:<CLUSTER_IP_NODE1>:6922/4091393816] conn(0x55e9db1ca000 0x55e9db1c4a00 unknown :-1 >
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.724+0000 7f535d3da700 -1 --2- <CLUSTER_IP_NODE1>:0/3582903554 >> [v2:<CLUSTER_IP_NODE1>:6970/4027233516,v1:<CLUSTER_IP_NODE1>:6976/4027233516] conn(0x55e9db1cb800 0x55e9db1c6d00 unknown :-1 >
Oct 18 16:17:46 <OSD_HOST_02> conmon[3587539]: debug 2021-10-18T16:17:46.724+0000 7f535cbd9700 -1 --2- <CLUSTER_IP_NODE1>:0/3582903554 >> [v2:<CLUSTER_IP_NODE1>:7112/1368237159,v1:<CLUSTER_IP_NODE1>:7115/1368237159] conn(0x55e9db15c800 0x55e9db162500 unknown :-1 >

Thanks,
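
P.S. For reference, these are the checks I've been running to confirm that jumbo frames actually pass end-to-end on both networks and that the LACP bonds have negotiated properly. The peer addresses and bond names below are just placeholders for our setup:

    # 8972 = 9000 minus the 20-byte IP header and 8-byte ICMP header;
    # -M do sets the don't-fragment bit, so any MTU mismatch on the path shows up as an error
    ping -M do -s 8972 -c 3 <front_net_peer_ip>
    ping -M do -s 8972 -c 3 <back_net_peer_ip>

    # LACP/802.3ad state, partner details and xmit hash policy for each bond
    cat /proc/net/bonding/<front_bond>
    cat /proc/net/bonding/<back_bond>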
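
The bind port change I mentioned above was along these lines; 7300 is the default ms_bind_port_max, and the value I'm setting below is only an example of widening the range, not a recommendation:

    # current messenger bind range for OSDs (defaults are 6800-7300)
    ceph config get osd ms_bind_port_min
    ceph config get osd ms_bind_port_max

    # raise the ceiling; the exact value here is illustrative
    ceph config set osd ms_bind_port_max 7568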
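
And this is how I've been looking at the slow ops themselves; osd.74 is just the OSD from the excerpt above, and the daemon command has to run on the host carrying that OSD (inside the OSD container, since ours are containerized):

    # overall health plus which requests are currently stuck
    ceph health detail
    ceph daemon osd.74 dump_ops_in_flight

    # the empty public interface in the set_numa_affinity line also made me re-check these
    ceph config get osd public_network
    ceph config get osd cluster_network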