Re: oops in rbd module (con_work in libceph)

On 09/07/2012 at 18:54, Yann Dupont wrote:

OK. I compiled the kernel this afternoon and tested it, without much success:

Jul 9 18:17:23 label5.u14.univ-nantes.prive kernel: [ 284.116236] libceph: osd0 172.20.14.130:6801 socket closed
Jul 9 18:17:43 label5.u14.univ-nantes.prive kernel: [ 304.101545] libceph: osd6 172.20.14.137:6800 socket closed
Jul 9 18:17:53 label5.u14.univ-nantes.prive kernel: [ 314.095155] libceph: osd3 172.20.14.134:6800 socket closed
Jul 9 18:18:38 label5.u14.univ-nantes.prive kernel: [ 359.075473] libceph: osd5 172.20.14.136:6800 socket closed
Jul 9 18:19:48 label5.u14.univ-nantes.prive kernel: [ 429.107334] libceph: osd6 172.20.14.137:6800 socket closed

One interesting thing I just noticed in the OSD logs:

osd-0.log
2012-07-09 18:17:23.763925 7ff9fc19e700 0 bad crc in data 3071411075 != exp 2231697357
2012-07-09 18:17:23.777607 7ff9fc19e700 0 -- 172.20.14.130:6801/5842 >> 172.20.14.132:0/1974511416 pipe(0x2236c80 sd=38 pgs=0 cs=0 l=0).accept peer addr is really 172.20.14.132:0/1974511416 (socket is 172.20.14.132:57972/0)

osd-3.log
2012-07-09 18:17:53.770111 7fe35461c700 0 bad crc in data 826922774 != exp 2498450653
2012-07-09 18:17:53.770972 7fe35461c700 0 -- 172.20.14.134:6800/4495 >> 172.20.14.132:0/1974511416 pipe(0xa44ec80 sd=56 pgs=0 cs=0 l=0).accept peer addr is really 172.20.14.132:0/1974511416 (socket is 172.20.14.132:40726/0)

osd-5.log
2012-07-09 18:18:38.766417 7ff4a66cb700 0 bad crc in data 3949121728 != exp 2496058560
2012-07-09 18:18:38.773386 7ff4a66cb700 0 -- 172.20.14.136:6800/4876 >> 172.20.14.132:0/1974511416 pipe(0x20eeb780 sd=56 pgs=0 cs=0 l=0).accept peer addr is really 172.20.14.132:0/1974511416 (socket is 172.20.14.132:57072/0)

osd-6.log
2012-07-09 18:17:43.765740 7fdf86b9d700 0 bad crc in data 2899452345 != exp 2656886014
2012-07-09 18:17:43.772599 7fdf86b9d700 0 -- 172.20.14.137:6800/5260 >> 172.20.14.132:0/1974511416 pipe(0x1ec64780 sd=31 pgs=0 cs=0 l=0).accept peer addr is really 172.20.14.132:0/1974511416 (socket is 172.20.14.132:48615/0)

2012-07-09 18:17:43.773170 7fdf8c718700 0 osd.6 347 pg[2.60( v 347'36181 (337'35180,347'36181] n=4144 ec=1 les/c 6/6 5/5/5) [6,7] r=0 lpr=5 mlcod 347'36180 active+clean] watch: ctx->obc=0x102db340 cookie=1 oi.version=36169 ctx->at_version=347'36182
2012-07-09 18:17:43.773209 7fdf8c718700 0 osd.6 347 pg[2.60( v 347'36181 (337'35180,347'36181] n=4144 ec=1 les/c 6/6 5/5/5) [6,7] r=0 lpr=5 mlcod 347'36180 active+clean] watch: oi.user_version=1559
2012-07-09 18:19:48.837952 7fdf86b9d700 0 bad crc in data 1231964953 != exp 2305533436
2012-07-09 18:19:48.838850 7fdf86b9d700 0 -- 172.20.14.137:6800/5260 >> 172.20.14.132:0/1974511416 pipe(0x1ec64c80 sd=31 pgs=0 cs=0 l=0).accept peer addr is really 172.20.14.132:0/1974511416 (socket is 172.20.14.132:48618/0)
2012-07-09 18:19:48.839493 7fdf8c718700 0 osd.6 347 pg[2.60( v 347'36192 (337'35191,347'36192] n=4144 ec=1 les/c 6/6 5/5/5) [6,7] r=0 lpr=5 mlcod 347'36191 active+clean] watch: ctx->obc=0x102db340 cookie=1 oi.version=36169 ctx->at_version=347'36193
2012-07-09 18:19:48.839530 7fdf8c718700 0 osd.6 347 pg[2.60( v 347'36192 (337'35191,347'36192] n=4144 ec=1 les/c 6/6 5/5/5) [6,7] r=0 lpr=5 mlcod 347'36191 active+clean] watch: oi.user_version=1559


Each "socket closed" message on the client coincides, to the second, with a bad CRC reported by the corresponding OSD. These are the only bad CRC errors logged that day, so the two seem related.
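
For anyone who wants to check this correlation on their own logs, here is a rough Python sketch. It is not part of Ceph; the function names, the regular expressions and the 2-second window are my own assumptions, based only on the line formats quoted above, and it presumes the client and OSD clocks are synchronised. It pairs each client "socket closed" line with a "bad crc in data" line from the same OSD that falls within the window.

```python
import re
from datetime import datetime, timedelta

# Month abbreviations are mapped by hand to avoid locale-dependent parsing.
MONTHS = {m: i for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Assumed formats, taken from the log excerpts in this message.
CLIENT_RE = re.compile(
    r"^(\w{3})\s+(\d+) (\d\d):(\d\d):(\d\d) .*libceph: osd(\d+) \S+ socket closed")
OSD_RE = re.compile(
    r"^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d\.\d+) \S+\s+\d+ bad crc in data")


def parse_client(lines, year):
    """Return (timestamp, osd_id) for each 'socket closed' kernel message.

    Syslog lines carry no year, so the caller has to supply it.
    """
    events = []
    for line in lines:
        m = CLIENT_RE.match(line)
        if m:
            mon, day, hh, mm, ss, osd = m.groups()
            ts = datetime(year, MONTHS[mon], int(day), int(hh), int(mm), int(ss))
            events.append((ts, int(osd)))
    return events


def parse_osd(lines):
    """Return the timestamp of each 'bad crc in data' line in one OSD log."""
    stamps = []
    for line in lines:
        m = OSD_RE.match(line)
        if m:
            stamps.append(datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return stamps


def correlate(client_events, crc_by_osd, window=timedelta(seconds=2)):
    """Split client events into those with and without a nearby bad CRC.

    crc_by_osd maps an OSD id to the list returned by parse_osd() for its log.
    """
    matched, unmatched = [], []
    for ts, osd in client_events:
        hits = [c for c in crc_by_osd.get(osd, []) if abs(c - ts) <= window]
        (matched if hits else unmatched).append((ts, osd))
    return matched, unmatched
```

To use it, feed the client kernel log to parse_client() with the year, build a dictionary such as {0: parse_osd(osd0_lines), 6: parse_osd(osd6_lines)}, and pass both to correlate(). Any entry left in the unmatched list is a socket close with no bad CRC nearby on that OSD.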

Cheers,

--
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@xxxxxxxxxxxxxx
