On Thursday, 5 October 2017 at 11:47 +0200, Ilya Dryomov wrote:
> The stable pages bug manifests as multiple sporadic connection resets,
> because in that case CRCs computed by the kernel don't always match the
> data that gets sent out. When the mismatch is detected on the OSD
> side, OSDs reset the connection and you'd see messages like
>
> libceph: osd1 1.2.3.4:6800 socket closed (con state OPEN)
> libceph: osd2 1.2.3.4:6804 socket error on write
>
> This is a different issue. Josy, Adrian, Olivier, do you also see
> messages of the "libceph: read_partial_message ..." type or is it just
> "libceph: ... bad crc/signature" errors?
>
> Thanks,
>
> Ilya

I see "read_partial_message" messages too, for example:

Oct 5 09:00:47 lorunde kernel: [65575.969322] libceph: read_partial_message ffff88027c231500 data crc 181941039 != exp. 115232978
Oct 5 09:00:47 lorunde kernel: [65575.969953] libceph: osd122 10.0.0.31:6800 bad crc/signature
Oct 5 09:04:30 lorunde kernel: [65798.958344] libceph: read_partial_message ffff880254a25c00 data crc 443114996 != exp. 2014723213
Oct 5 09:04:30 lorunde kernel: [65798.959044] libceph: osd18 10.0.0.22:6802 bad crc/signature
Oct 5 09:14:28 lorunde kernel: [66396.788272] libceph: read_partial_message ffff880238636200 data crc 1797729588 != exp. 2550563968
Oct 5 09:14:28 lorunde kernel: [66396.788984] libceph: osd43 10.0.0.9:6804 bad crc/signature
Oct 5 10:09:36 lorunde kernel: [69704.211672] libceph: read_partial_message ffff8802712dff00 data crc 2241944833 != exp. 762990605
Oct 5 10:09:36 lorunde kernel: [69704.212422] libceph: osd103 10.0.0.28:6804 bad crc/signature
Oct 5 10:25:41 lorunde kernel: [70669.203596] libceph: read_partial_message ffff880257521400 data crc 3655331946 != exp. 2796991675
Oct 5 10:25:41 lorunde kernel: [70669.204462] libceph: osd16 10.0.0.21:6806 bad crc/signature
Oct 5 10:25:52 lorunde kernel: [70680.255943] libceph: read_partial_message ffff880245e3d600 data crc 3787567693 != exp. 725251636
Oct 5 10:25:52 lorunde kernel: [70680.257066] libceph: osd60 10.0.0.23:6800 bad crc/signature

On the OSD side, for osd122 for example, I don't see any "reset" in the
OSD log.

Thanks,

Olivier
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
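For anyone who wants to check how these errors are spread across OSDs, here is a rough Python sketch that tallies them from kernel log lines. It is only an illustration: the function name `summarize_crc_errors` and the two regular expressions are hypothetical and were written against the excerpt above. It also assumes that each "bad crc/signature" line belongs to the "read_partial_message" line immediately before it, which is how the messages appear in this log but is not something the kernel guarantees.

```python
import re
from collections import Counter

# Patterns inferred from the log excerpt in this thread (an assumption,
# not an official description of the libceph message format).
CRC_RE = re.compile(
    r"libceph: read_partial_message \S+ data crc (\d+) != exp\. (\d+)"
)
BAD_RE = re.compile(r"libceph: (osd\d+) (\S+) bad crc/signature")


def summarize_crc_errors(lines):
    """Count 'bad crc/signature' errors per OSD.

    Returns (per_osd, events), where events is a list of
    (osd, address, (computed_crc, expected_crc) or None).
    """
    per_osd = Counter()
    events = []
    pending = None  # CRC pair from the most recent read_partial_message line
    for line in lines:
        m = CRC_RE.search(line)
        if m:
            pending = (int(m.group(1)), int(m.group(2)))
            continue
        m = BAD_RE.search(line)
        if m:
            osd, addr = m.groups()
            per_osd[osd] += 1
            # Assumption: the preceding CRC line describes this error.
            events.append((osd, addr, pending))
            pending = None
    return per_osd, events


# First four lines of the log excerpt quoted in this thread.
SAMPLE = [
    "Oct 5 09:00:47 lorunde kernel: [65575.969322] libceph: "
    "read_partial_message ffff88027c231500 data crc 181941039 != exp. 115232978",
    "Oct 5 09:00:47 lorunde kernel: [65575.969953] libceph: "
    "osd122 10.0.0.31:6800 bad crc/signature",
    "Oct 5 09:04:30 lorunde kernel: [65798.958344] libceph: "
    "read_partial_message ffff880254a25c00 data crc 443114996 != exp. 2014723213",
    "Oct 5 09:04:30 lorunde kernel: [65798.959044] libceph: "
    "osd18 10.0.0.22:6802 bad crc/signature",
]

if __name__ == "__main__":
    counts, events = summarize_crc_errors(SAMPLE)
    for osd, n in counts.most_common():
        print(osd, n)
```

If the counts cluster on a few OSDs or on one host, that would point at something local to those machines; an even spread, as in the excerpt above, would suggest looking at the client side instead.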