On 22.08.18 at 07:17, Anthony D'Atri wrote:
>
>> I'm posting on ceph-devel because I didn't get any feedback on
>> ceph-users. This is an act of desperation…
>
> I keep thinking “I’ll catch up on ceph-users tomorrow”, then I realize
> that at the moment I have 208 digests in my inbox. I muse at times that
> a different format or division might address the volume, but have yet
> to come up with a strategy that doesn’t cause at least as many problems
> as it solves. Enforcing a strict limit on message size might do wonders.
> That said ...
>
>> TL;DR: Cluster runs good with Kernel 4.13, produces slow_requests with
>> Kernel 4.15. How to debug?
>
> This is something of a longshot, but is it possible that your 4.13
> kernel tree has updated vendor drivers overlaid, but your 4.15 tree
> doesn’t?

I don't think so. Both 4.13 and 4.15 report v1.5.3-1.534 of the myri10ge driver, and even the firmware versions (boot and runtime) on the cards are identical on all hosts. Output below.

Comparing the source code of the myri10ge driver shipped with 4.13 and 4.15 shows a 4-line, time-related change. I cannot assess whether this has anything to do with the issue.

I'm now following Sage's advice to check the MTU and LACP settings. The Myricom cards are a little irritating, as they seem to have a default MTU of 9000; at least this is what the networking stack configures when no MTU is provided in the configuration file.

#### kernel 4.13 ####

# dmesg | grep myri10ge
[    3.952816] myri10ge: Version 1.5.3-1.534
[    4.080169] myri10ge 0000:05:00.0: Not enabling ECRC on non-root port 0000:04:00.0
[    4.404676] myri10ge 0000:05:00.0: MSI IRQ 26, tx bndry 4096, fw myri10ge_eth_z8e.dat, MTRR Disabled, WC Enabled
[    4.532094] myri10ge 0000:07:00.0: Not enabling ECRC on non-root port 0000:04:05.0
[    4.864582] myri10ge 0000:07:00.0: MSI IRQ 28, tx bndry 4096, fw myri10ge_eth_z8e.dat, MTRR Disabled, WC Enabled
[    4.879655] myri10ge 0000:05:00.0 rename3: renamed from eth1
[    5.024356] myri10ge 0000:07:00.0 eth3: renamed from eth2
[    5.100279] myri10ge 0000:05:00.0 eth2: renamed from rename3
[   19.134238] myri10ge 0000:05:00.0 eth2: link down
[   19.142433] myri10ge 0000:05:00.0 eth2: link up
[   19.510470] myri10ge 0000:07:00.0 eth3: link down
[   19.986032] myri10ge 0000:05:00.0 eth2: changing mtu from 9000 to 1500
[   20.322838] myri10ge 0000:05:00.0 eth2: link down
[   20.331032] myri10ge 0000:05:00.0 eth2: link up

# modinfo myri10ge
filename:       /lib/modules/4.13.16-4-pve/kernel/drivers/net/ethernet/myricom/myri10ge/myri10ge.ko
firmware:       myri10ge_rss_eth_z8e.dat
firmware:       myri10ge_rss_ethp_z8e.dat
firmware:       myri10ge_eth_z8e.dat
firmware:       myri10ge_ethp_z8e.dat
license:        Dual BSD/GPL
version:        1.5.3-1.534
author:         Maintainer: help@xxxxxxxx
description:    Myricom 10G driver (10GbE)
srcversion:     6D094BCAD6D5C81C0789E39
alias:          pci:v000014C1d00000009sv*sd*bc*sc*i*
alias:          pci:v000014C1d00000008sv*sd*bc*sc*i*
depends:        dca
retpoline:      Y
intree:         Y
name:           myri10ge
vermagic:       4.13.16-4-pve SMP mod_unload modversions
parm:           myri10ge_fw_name:Firmware image name (charp)
parm:           myri10ge_fw_names:Firmware image names per board (array of charp)
parm:           myri10ge_ecrc_enable:Enable Extended CRC on PCI-E (int)
parm:           myri10ge_small_bytes:Threshold of small packets (int)
parm:           myri10ge_msi:Enable Message Signalled Interrupts (int)
parm:           myri10ge_intr_coal_delay:Interrupt coalescing delay (int)
parm:           myri10ge_flow_control:Pause parameter (int)
parm:           myri10ge_deassert_wait:Wait when deasserting legacy interrupts (int)
parm:           myri10ge_force_firmware:Force firmware to assume aligned completions (int)
parm:           myri10ge_initial_mtu:Initial MTU (int)
parm:           myri10ge_napi_weight:Set NAPI weight (int)
parm:           myri10ge_watchdog_timeout:Set watchdog timeout (int)
parm:           myri10ge_max_irq_loops:Set stuck legacy IRQ detection threshold (int)
parm:           myri10ge_debug:Debug level (0=none,...,16=all) (int)
parm:           myri10ge_fill_thresh:Number of empty rx slots allowed (int)
parm:           myri10ge_max_slices:Max tx/rx queues (int)
parm:           myri10ge_rss_hash:Type of RSS hashing to do (int)
parm:           myri10ge_dca:Enable DCA if possible (int)

# ./myri-tools-1.28/bin/myri_info
pci-dev at 05:00.0 vendor:product(rev)=14c1:0008(01)
  behind bridge downstream-port: 04:00.0 10b5:8624 (x8.1/x8.2)
  behind bridge upstream-port: 03:00.0 10b5:8624 (x8.2/x8.2)
  behind bridge root-port: 00:07.0 8086:340e (x8.2/x8.2)
  Myri-10G-PCIE-8B -- Link x8
  EEPROM String-spec:
    MAC=00:60:dd:43:28:7a
    SN=497051
    PWR=100
    PC=10G-PCIE2-8C2-2S
    PN=09-04477
    MEMMAP=----------------
    PSERDES=0
    TAG=ze_tools-1_4_53a
  EEPROM MCP, PRESENT, length = 106980, crc=0xafd35588
    ETHZ::1.4.59 2014/06/27 21:42:52 self extracting firmware
    Simple-bundle: exec_len = 106976
  Running MCP:
    ETH ::1.4.57 -- 2013/10/23 13:58:51 myri10ge firmware
pci-dev at 07:00.0 vendor:product(rev)=14c1:0008(01)
  behind bridge downstream-port: 04:05.0 10b5:8624 (x8.1/x8.2)
  behind bridge upstream-port: 03:00.0 10b5:8624 (x8.2/x8.2)
  behind bridge root-port: 00:07.0 8086:340e (x8.2/x8.2)
  Myri-10G-PCIE-8B -- Link x8
  EEPROM String-spec:
    MAC=00:60:dd:43:28:7b
    SN=497051
    PWR=100
    PC=10G-PCIE2-8C2-2S
    PN=09-04477
    MEMMAP=----------------
    PSERDES=0
    TAG=ze_tools-1_4_53a
  EEPROM MCP, PRESENT, length = 106980, crc=0xafd35588
    ETHZ::1.4.59 2014/06/27 21:42:52 self extracting firmware
    Simple-bundle: exec_len = 106976
  Running MCP:
    ETH ::1.4.57 -- 2013/10/23 13:58:51 myri10ge firmware

#### kernel 4.15 ####

# dmesg | grep myri10ge
[    4.406352] myri10ge: Version 1.5.3-1.534
[    4.568020] myri10ge 0000:05:00.0: Not enabling ECRC on non-root port 0000:04:00.0
[    4.936692] myri10ge 0000:05:00.0: MSI IRQ 26, tx bndry 4096, fw myri10ge_eth_z8e.dat, MTRR Disabled, WC Enabled
[    5.056080] myri10ge 0000:07:00.0: Not enabling ECRC on non-root port 0000:04:05.0
[    5.424728] myri10ge 0000:07:00.0: MSI IRQ 28, tx bndry 4096, fw myri10ge_eth_z8e.dat, MTRR Disabled, WC Enabled
[    5.438021] myri10ge 0000:05:00.0 rename3: renamed from eth1
[    5.480345] myri10ge 0000:07:00.0 eth3: renamed from eth2
[    5.530456] myri10ge 0000:05:00.0 eth2: renamed from rename3
[   13.206563] myri10ge 0000:05:00.0 eth2: link down
[   13.214742] myri10ge 0000:05:00.0 eth2: link up
[   19.761918] myri10ge 0000:05:00.0 eth2: changing mtu from 9000 to 1500
[   20.067762] myri10ge 0000:05:00.0 eth2: link down
[   20.075970] myri10ge 0000:05:00.0 eth2: link up

# modinfo myri10ge
filename:       /lib/modules/4.15.18-2-pve/kernel/drivers/net/ethernet/myricom/myri10ge/myri10ge.ko
firmware:       myri10ge_rss_eth_z8e.dat
firmware:       myri10ge_rss_ethp_z8e.dat
firmware:       myri10ge_eth_z8e.dat
firmware:       myri10ge_ethp_z8e.dat
license:        Dual BSD/GPL
version:        1.5.3-1.534
author:         Maintainer: help@xxxxxxxx
description:    Myricom 10G driver (10GbE)
srcversion:     46526E4E4E82667CBFF2D7C
alias:          pci:v000014C1d00000009sv*sd*bc*sc*i*
alias:          pci:v000014C1d00000008sv*sd*bc*sc*i*
depends:        dca
retpoline:      Y
intree:         Y
name:           myri10ge
vermagic:       4.15.18-2-pve SMP mod_unload modversions
parm:           myri10ge_fw_name:Firmware image name (charp)
parm:           myri10ge_fw_names:Firmware image names per board (array of charp)
parm:           myri10ge_ecrc_enable:Enable Extended CRC on PCI-E (int)
parm:           myri10ge_small_bytes:Threshold of small packets (int)
parm:           myri10ge_msi:Enable Message Signalled Interrupts (int)
parm:           myri10ge_intr_coal_delay:Interrupt coalescing delay (int)
parm:           myri10ge_flow_control:Pause parameter (int)
parm:           myri10ge_deassert_wait:Wait when deasserting legacy interrupts (int)
parm:           myri10ge_force_firmware:Force firmware to assume aligned completions (int)
parm:           myri10ge_initial_mtu:Initial MTU (int)
parm:           myri10ge_napi_weight:Set NAPI weight (int)
parm:           myri10ge_watchdog_timeout:Set watchdog timeout (int)
parm:           myri10ge_max_irq_loops:Set stuck legacy IRQ detection threshold (int)
parm:           myri10ge_debug:Debug level (0=none,...,16=all) (int)
parm:           myri10ge_fill_thresh:Number of empty rx slots allowed (int)
parm:           myri10ge_max_slices:Max tx/rx queues (int)
parm:           myri10ge_rss_hash:Type of RSS hashing to do (int)
parm:           myri10ge_dca:Enable DCA if possible (int)

# ./myri-tools-1.28/bin/myri_info
pci-dev at 05:00.0 vendor:product(rev)=14c1:0008(01)
  behind bridge downstream-port: 04:00.0 10b5:8624 (x8.1/x8.2)
  behind bridge upstream-port: 03:00.0 10b5:8624 (x8.2/x8.2)
  behind bridge root-port: 00:07.0 8086:340e (x8.2/x8.2)
  Myri-10G-PCIE-8B -- Link x8
  EEPROM String-spec:
    MAC=00:60:dd:43:28:7a
    SN=497051
    PWR=100
    PC=10G-PCIE2-8C2-2S
    PN=09-04477
    MEMMAP=----------------
    PSERDES=0
    TAG=ze_tools-1_4_53a
  EEPROM MCP, PRESENT, length = 106980, crc=0xafd35588
    ETHZ::1.4.59 2014/06/27 21:42:52 self extracting firmware
    Simple-bundle: exec_len = 106976
  Running MCP:
    ETH ::1.4.57 -- 2013/10/23 13:58:51 myri10ge firmware
pci-dev at 07:00.0 vendor:product(rev)=14c1:0008(01)
  behind bridge downstream-port: 04:05.0 10b5:8624 (x8.1/x8.2)
  behind bridge upstream-port: 03:00.0 10b5:8624 (x8.2/x8.2)
  behind bridge root-port: 00:07.0 8086:340e (x8.2/x8.2)
  Myri-10G-PCIE-8B -- Link x8
  EEPROM String-spec:
    MAC=00:60:dd:43:28:7b
    SN=497051
    PWR=100
    PC=10G-PCIE2-8C2-2S
    PN=09-04477
    MEMMAP=----------------
    PSERDES=0
    TAG=ze_tools-1_4_53a
  EEPROM MCP, PRESENT, length = 106980, crc=0xafd35588
    ETHZ::1.4.59 2014/06/27 21:42:52 self extracting firmware
    Simple-bundle: exec_len = 106976
  Running MCP:
    ETH ::1.4.57 -- 2013/10/23 13:58:51 myri10ge firmware
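
The two modinfo listings above are long enough that a difference is easy to miss by eye. A minimal Python sketch for comparing them field by field might look like the following; the helper names `parse_modinfo` and `diff_modinfo` are made up for illustration, and the samples are abbreviated copies of the output above. On these samples only `filename`, `srcversion` and `vermagic` differ. A differing `srcversion` with an unchanged `version` string would be consistent with the small source change between the two kernel trees, though it says nothing about whether that change matters.

```python
from collections import defaultdict


def parse_modinfo(text):
    """Parse `modinfo` output into {field: [values, ...]}.

    Fields such as firmware, alias and parm occur several times,
    so every field maps to a list of values in order of appearance.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        # Split on the first colon only; parm values contain colons themselves.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()].append(value.strip())
    return dict(fields)


def diff_modinfo(old, new):
    """Return {field: (old_values, new_values)} for every field that differs."""
    a, b = parse_modinfo(old), parse_modinfo(new)
    return {
        key: (a.get(key, []), b.get(key, []))
        for key in sorted(set(a) | set(b))
        if a.get(key, []) != b.get(key, [])
    }


# Abbreviated samples copied from the two listings above.
MODINFO_413 = """\
filename:       /lib/modules/4.13.16-4-pve/kernel/drivers/net/ethernet/myricom/myri10ge/myri10ge.ko
version:        1.5.3-1.534
srcversion:     6D094BCAD6D5C81C0789E39
vermagic:       4.13.16-4-pve SMP mod_unload modversions
parm:           myri10ge_initial_mtu:Initial MTU (int)
"""

MODINFO_415 = """\
filename:       /lib/modules/4.15.18-2-pve/kernel/drivers/net/ethernet/myricom/myri10ge/myri10ge.ko
version:        1.5.3-1.534
srcversion:     46526E4E4E82667CBFF2D7C
vermagic:       4.15.18-2-pve SMP mod_unload modversions
parm:           myri10ge_initial_mtu:Initial MTU (int)
"""

if __name__ == "__main__":
    for field, (old, new) in diff_modinfo(MODINFO_413, MODINFO_415).items():
        print(f"{field}: {old} -> {new}")
```

In practice the two strings would come from saving `modinfo myri10ge` to a file under each kernel and reading both files back in.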