CRC Bad Signature when using KRBD

Hello Ceph Users,

* Problem: we get the errors shown below when using krbd. We use RBD for VM disks.
* Workaround: after switching to librbd, the errors disappear.

* Software:
** Kernel: 6.8.8-2 (parameters: intel_iommu=on iommu=pt pcie_aspm.policy=performance)
** Ceph: 18.2.2

Description/Details 1: libceph errors in the kernel log when using krbd with Ceph:

[Wed Aug 21 03:04:17 2024] libceph: read_partial_message 0000000015af2284 data crc 1221767919 != exp. 282251377
[Wed Aug 21 03:04:17 2024] libceph: read_partial_message 0000000066b200ab data crc 3817026135 != exp. 3925662391
[Wed Aug 21 03:04:17 2024] libceph: osd15 (1)10.1.4.13:6836 bad crc/signature
[Wed Aug 21 03:04:17 2024] libceph: osd13 (1)10.1.4.13:6809 bad crc/signature
[Wed Aug 21 03:04:21 2024] libceph: read_partial_message 000000008a131738 data crc 2612835980 != exp. 917302924
[Wed Aug 21 03:04:21 2024] libceph: read_partial_message 000000005160776b data crc 2965872045 != exp. 563323792
[Wed Aug 21 03:04:21 2024] libceph: osd15 (1)10.1.4.13:6836 bad crc/signature
[Wed Aug 21 03:04:21 2024] libceph: osd6 (1)10.1.4.12:6843 bad crc/signature
[Wed Aug 21 03:06:44 2024] libceph: read_partial_message 000000007e548354 data crc 1265032637 != exp. 2426281931
[Wed Aug 21 03:06:44 2024] libceph: osd0 (1)10.1.4.11:6835 bad crc/signature
[Wed Aug 21 03:06:44 2024] libceph: read_partial_message 000000009214d802 data crc 2596010853 != exp. 1221875667
[Wed Aug 21 03:06:44 2024] libceph: osd10 (1)10.1.4.12:6809 bad crc/signature
[Wed Aug 21 03:06:47 2024] libceph: read_partial_message 000000000f9edc73 data crc 1326019705 != exp. 3079604517
[Wed Aug 21 03:06:47 2024] libceph: osd3 (1)10.1.4.11:6803 bad crc/signature
[Wed Aug 21 03:06:50 2024] libceph: read_partial_message 000000004769da61 data crc 3421275194 != exp. 4183754554
[Wed Aug 21 03:06:50 2024] libceph: osd8 (1)10.1.4.12:6835 bad crc/signature
[Wed Aug 21 03:06:51 2024] libceph: read_partial_message 0000000044db9a59 data crc 2603270708 != exp. 4150529351
[Wed Aug 21 03:06:51 2024] libceph: osd14 (1)10.1.4.13:6806 bad crc/signature
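
To check whether the mismatches cluster on a single OSD or host, the `bad crc/signature` lines from the log above can be tallied. This is only a rough sketch: the regular expression and the helper name `tally_bad_crc` are my own assumptions about the line format, not part of any Ceph tooling.

```python
import re
from collections import Counter

# The "bad crc/signature" lines copied from the kernel log above.
SAMPLE_LOG = """\
[Wed Aug 21 03:04:17 2024] libceph: osd15 (1)10.1.4.13:6836 bad crc/signature
[Wed Aug 21 03:04:17 2024] libceph: osd13 (1)10.1.4.13:6809 bad crc/signature
[Wed Aug 21 03:04:21 2024] libceph: osd15 (1)10.1.4.13:6836 bad crc/signature
[Wed Aug 21 03:04:21 2024] libceph: osd6 (1)10.1.4.12:6843 bad crc/signature
[Wed Aug 21 03:06:44 2024] libceph: osd0 (1)10.1.4.11:6835 bad crc/signature
[Wed Aug 21 03:06:44 2024] libceph: osd10 (1)10.1.4.12:6809 bad crc/signature
[Wed Aug 21 03:06:47 2024] libceph: osd3 (1)10.1.4.11:6803 bad crc/signature
[Wed Aug 21 03:06:50 2024] libceph: osd8 (1)10.1.4.12:6835 bad crc/signature
[Wed Aug 21 03:06:51 2024] libceph: osd14 (1)10.1.4.13:6806 bad crc/signature
"""

# Assumed line shape: "libceph: osdN (T)A.B.C.D:PORT bad crc/signature"
BAD_CRC = re.compile(
    r"libceph: (osd\d+) \(\d+\)(\d+\.\d+\.\d+\.\d+):\d+ bad crc/signature"
)


def tally_bad_crc(text):
    """Count bad crc/signature events per OSD and per OSD host address."""
    per_osd = Counter()
    per_host = Counter()
    for match in BAD_CRC.finditer(text):
        per_osd[match.group(1)] += 1
        per_host[match.group(2)] += 1
    return per_osd, per_host


if __name__ == "__main__":
    per_osd, per_host = tally_bad_crc(SAMPLE_LOG)
    for host, count in sorted(per_host.items()):
        print(host, count)
```

For this excerpt the tally gives 2 events on 10.1.4.11, 3 on 10.1.4.12 and 4 on 10.1.4.13, so the errors are not limited to one OSD or one OSD host.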

Description/Details 2: the VMs see buffer I/O errors on RBD-backed virtual disks:

Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#211 timing out command, waited 180s
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#39 timing out command, waited 180s
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#211 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=885s
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#39 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=847s
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#211 Sense Key: Aborted Command [current]
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#211 Add. Sense: I/O process terminated
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#39 Sense Key: Aborted Command [current]
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#39 Add. Sense: I/O process terminated
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#211 CDB: Write(10) 2a 00 34 87 48 08 00 00 08 00
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#39 CDB: Write(10) 2a 00 34 81 52 70 00 00 58 00
Aug 24 03:16:01 de-vlix-dbix-01 kernel: I/O error, dev sdb, sector 881281032 op 0x1: (WRITE) flags 0x800 phys_seg 1 prio class 0
Aug 24 03:16:01 de-vlix-dbix-01 kernel: I/O error, dev sdb, sector 880890480 op 0x1: (WRITE) flags 0x103000 phys_seg 11 prio class 0
Aug 24 03:16:01 de-vlix-dbix-01 kernel: EXT4-fs warning (device sdb1): ext4_end_bio:343: I/O error 10 writing to inode 27525908 starting
Aug 24 03:16:01 de-vlix-dbix-01 kernel: Buffer I/O error on dev sdb1, logical block 110111054, lost async page write
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#212 timing out command, waited 180s
Aug 24 03:16:01 de-vlix-dbix-01 kernel: buffer_io_error: 21 callbacks suppressed
Aug 24 03:16:01 de-vlix-dbix-01 kernel: Buffer I/O error on device sdb1, logical block 110159873
Aug 24 03:16:01 de-vlix-dbix-01 kernel: Buffer I/O error on dev sdb1, logical block 110111055, lost async page write
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#212 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=875s
Aug 24 03:16:01 de-vlix-dbix-01 kernel: Buffer I/O error on dev sdb1, logical block 110111056, lost async page write
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#212 Sense Key: Aborted Command [current]
Aug 24 03:16:01 de-vlix-dbix-01 kernel: Buffer I/O error on dev sdb1, logical block 110111057, lost async page write
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#212 Add. Sense: I/O process terminated
Aug 24 03:16:01 de-vlix-dbix-01 kernel: Buffer I/O error on dev sdb1, logical block 110111058, lost async page write
Aug 24 03:16:01 de-vlix-dbix-01 kernel: sd 0:0:0:1: [sdb] tag#212 CDB: Write(10) 2a 00 34 87 48 10 00 00 08 00
Aug 24 03:16:01 de-vlix-dbix-01 kernel: Buffer I/O error on dev sdb1, logical block 110111059, lost async page write
Aug 24 03:16:01 de-vlix-dbix-01 kernel: I/O error, dev sdb, sector 881281040 op 0x1: (WRITE) flags 0x800 phys_seg 1 prio class 0
Aug 24 03:16:01 de-vlix-dbix-01 kernel: Buffer I/O error on dev sdb1, logical block 110111060, lost async page write
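
As a cross-check on the guest log above, the Write(10) CDB bytes can be decoded to confirm that the timed-out commands are the same writes reported as I/O errors. This is a minimal sketch assuming the standard 10-byte WRITE(10) layout (opcode 0x2a, big-endian LBA in bytes 2 to 5, transfer length in bytes 7 to 8); the helper name `decode_write10` is mine.

```python
def decode_write10(cdb_hex):
    """Return (lba, block_count) for a WRITE(10) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # transfer length in blocks
    return lba, blocks


if __name__ == "__main__":
    # CDBs for tag#211, tag#39 and tag#212, copied from the log above.
    for cdb in (
        "2a 00 34 87 48 08 00 00 08 00",
        "2a 00 34 81 52 70 00 00 58 00",
        "2a 00 34 87 48 10 00 00 08 00",
    ):
        print(cdb, "->", decode_write10(cdb))
```

The decoded LBAs are 881281032, 880890480 and 881281040, which match the sector numbers in the `I/O error, dev sdb` lines. That match suggests, but does not prove, that the disk uses 512-byte logical blocks, since kernel sector numbers are always in 512-byte units.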

Thanks for helping out.

Greetings,
Jonas


