Re: bad crc/signature errors

Hi Ilya,

Yes, we also see messages of the "libceph: read_partial_message ..." type:

 =================================


[715907.891171] libceph: read_partial_message ffff88033755a300 data crc 4149769120 != exp. 2349968434
[715907.892163] libceph: read_partial_message ffff88033755b000 data crc 2455195536 != exp. 2750456034
[715907.892167] libceph: osd17 10.255.0.9:6800 bad crc/signature
[715907.893807] libceph: osd16 10.255.0.8:6816 bad crc/signature
[715907.896219] libceph: read_partial_message ffff8803d8484400 data crc 455708272 != exp. 1414757638
[715907.897442] libceph: osd27 10.255.0.11:6820 bad crc/signature
[715938.129539] xen-blkback: backend/vbd/3952/768: prepare for reconnect
[715938.470670] libceph: read_partial_message ffff88030fb89600 data crc 1569919842 != exp. 3397794567
[715938.470711] libceph: read_partial_message ffff88017ffeb300 data crc 3909314762 != exp. 2254973565
[715938.470715] libceph: osd5 10.255.0.6:6812 bad crc/signature
[715938.471898] libceph: osd25 10.255.0.11:6800 bad crc/signature
[715938.473788] libceph: read_partial_message ffff88017ffeb300 data crc 682925087 != exp. 2254973565
[715938.474214] libceph: read_partial_message ffff88030fb89600 data crc 3941482587 != exp. 3397794567
[715938.474217] libceph: osd25 10.255.0.11:6800 bad crc/signature
[715938.475026] libceph: osd5 10.255.0.6:6812 bad crc/signature
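For anyone who wants to tally these, here is a rough Python sketch that counts "bad crc/signature" events per OSD and collects the computed/expected CRC pairs from "read_partial_message" lines. The regular expressions are written only against the message formats shown above, and the function name `summarize` is made up for this example. It is not an official Ceph tool.

```python
import re
from collections import Counter

# Patterns written against the dmesg lines quoted above (an assumption:
# other kernel versions may format these messages differently).
CRC_MISMATCH = re.compile(
    r"libceph: read_partial_message (?P<msg>[0-9a-f]+) "
    r"data crc (?P<got>\d+) != exp\. (?P<exp>\d+)"
)
BAD_CRC = re.compile(
    r"libceph: (?P<osd>osd\d+) (?P<addr>\S+) bad crc/signature"
)


def summarize(log_text):
    """Return (mismatches, per_osd) extracted from raw dmesg text.

    mismatches: list of (message pointer, computed crc, expected crc)
    per_osd:    Counter of "bad crc/signature" events keyed by OSD name
    """
    # finditer scans the whole text, so two messages fused onto one
    # physical line are still picked up individually.
    mismatches = [
        (m["msg"], int(m["got"]), int(m["exp"]))
        for m in CRC_MISMATCH.finditer(log_text)
    ]
    per_osd = Counter(m["osd"] for m in BAD_CRC.finditer(log_text))
    return mismatches, per_osd


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 this_script.py
    mismatches, per_osd = summarize(sys.stdin.read())
    for osd, count in per_osd.most_common():
        print(f"{osd}: {count} bad crc/signature event(s)")
    print(f"{len(mismatches)} read_partial_message CRC mismatch(es)")
```

Grouping the mismatches by message pointer or expected CRC would also show retries of the same message, such as ffff88017ffeb300 above, which appears twice with the same expected value but different computed values.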


On 05-10-2017 15:17, Ilya Dryomov wrote:
On Thu, Oct 5, 2017 at 7:53 AM, Adrian Saul
<Adrian.Saul@xxxxxxxxxxxxxxxxx> wrote:
We see the same messages and are also running a 4.4 kernel whose KRBD is affected by this.

As far as I know, I have seen no impact from it so far.


-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
Jason Dillaman
Sent: Thursday, 5 October 2017 5:45 AM
To: Gregory Farnum <gfarnum@xxxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxxxxxxxxx>; Josy
<josy@xxxxxxxxxxxxxxxxxxxxx>
Subject: Re:  bad crc/signature errors

Perhaps this is related to a known issue on some 4.4 and later kernels [1]
where the stable write flag was not preserved by the kernel?

[1] http://tracker.ceph.com/issues/19275

The stable pages bug manifests as multiple sporadic connection resets,
because in that case CRCs computed by the kernel don't always match the
data that gets sent out.  When the mismatch is detected on the OSD
side, OSDs reset the connection and you'd see messages like

   libceph: osd1 1.2.3.4:6800 socket closed (con state OPEN)
   libceph: osd2 1.2.3.4:6804 socket error on write

This is a different issue.  Josy, Adrian, Olivier, do you also see
messages of the "libceph: read_partial_message ..." type or is it just
"libceph: ... bad crc/signature" errors?
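As a toy illustration of the stable pages failure mode described above (a buffer changing after its checksum has been computed but before it goes out on the wire), here is a minimal Python sketch. The function names are hypothetical, and `zlib.crc32` is only a stand-in: Ceph's messenger uses CRC32C, which the Python standard library does not provide. Nothing here reflects actual kernel code.

```python
import zlib


def send_with_checksum(buf, mutate_after_checksum):
    """Simulate a sender: checksum the buffer, then put it on the wire.

    If mutate_after_checksum is True, the buffer is modified between the
    checksum and the send, as happens when a page is not kept stable
    while it is under writeback.
    """
    crc = zlib.crc32(bytes(buf))  # stand-in for CRC32C (assumption)
    if mutate_after_checksum:
        buf[0] ^= 0xFF  # the "page" changes while the write is in flight
    wire = bytes(buf)  # what the receiver actually gets
    return wire, crc


def receiver_accepts(wire, crc):
    """Simulate the receiving side recomputing and comparing the CRC."""
    return zlib.crc32(wire) == crc


if __name__ == "__main__":
    stable = send_with_checksum(bytearray(b"block of data"), False)
    unstable = send_with_checksum(bytearray(b"block of data"), True)
    print("stable buffer accepted:  ", receiver_accepts(*stable))
    print("unstable buffer accepted:", receiver_accepts(*unstable))
```

In the real case the receiver that rejects the message is the OSD, which then resets the connection. That is why the stable pages bug shows up as "socket closed" or "socket error on write" on the client rather than as a CRC complaint.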

Thanks,

                 Ilya





