Re: rocksdb corruption, stale pg, rebuild bucket index

Idea received from Wido den Hollander:
bluestore rocksdb options = "compaction_readahead_size=0"

With this option, I just tried to start 1 of the 3 crashing OSDs, and it came up! I did this with "ceph osd set noin" in place for now.
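
For reference, a minimal sketch of how this can be applied (my assumption: the option goes into ceph.conf on the OSD host and the OSD runs as the systemd unit ceph-osd@266; adjust for your setup):

  # /etc/ceph/ceph.conf on the OSD host
  [osd]
  bluestore rocksdb options = "compaction_readahead_size=0"

  # keep the repaired OSD from being marked "in" automatically, then start it
  ceph osd set noin
  systemctl start ceph-osd@266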

Later it aborted:

2019-06-13 13:11:11.862 7f2a19f5f700 1 heartbeat_map reset_timeout 'OSD::osd_op_tp thread 0x7f2a19f5f700' had timed out after 15
2019-06-13 13:11:11.862 7f2a19f5f700 1 heartbeat_map reset_timeout 'OSD::osd_op_tp thread 0x7f2a19f5f700' had suicide timed out after 150
2019-06-13 13:11:11.862 7f2a37982700 0 --1- v1:[2001:620:5ca1:201::119]:6809/3426631 >> v1:[2001:620:5ca1:201::144]:6821/3627456 conn(0x564f65c0c000 0x564f26d6d800 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=18075 cs=1 l=0).handle_connect_reply_2 connect got RESETSESSION
2019-06-13 13:11:11.862 7f2a19f5f700 -1 *** Caught signal (Aborted) **
 in thread 7f2a19f5f700 thread_name:tp_osd_tp

ceph version 14.2.1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable)
 1: (()+0x12890) [0x7f2a3a818890]
 2: (pthread_kill()+0x31) [0x7f2a3a8152d1]
 3: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, unsigned long)+0x24b) [0x564d732ca2bb]
 4: (ceph::HeartbeatMap::reset_timeout(ceph::heartbeat_handle_d*, unsigned long, unsigned long)+0x255) [0x564d732ca895]
 5: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5a0) [0x564d732eb560]
 6: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x564d732ed5d0]
 7: (()+0x76db) [0x7f2a3a80d6db]
 8: (clone()+0x3f) [0x7f2a395ad88f]

I guess this is because of the pending backfilling and the noin flag? Afterwards it restarted by itself and came up; I stopped it again for now.
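
If the suicide timeout keeps killing it during this kind of recovery work, one option (my assumption, based on the "suicide timed out after 150" in the log, which matches the default osd_op_thread_suicide_timeout) would be to raise that timeout temporarily for this OSD:

  # example only: temporarily raise the op thread suicide timeout for osd.266
  ceph config set osd.266 osd_op_thread_suicide_timeout 600
  # remember to remove the override again afterwards
  ceph config rm osd.266 osd_op_thread_suicide_timeout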

It looks healthy so far:
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-266 fsck
fsck success

Now we have to choose how to continue, trying to reduce the risk of losing data (most bucket indexes are still intact at the moment). My suggestion would be to let this OSD (which was not the primary) go in and hope that it recovers. If that causes problems, maybe we could still use the other OSDs "somehow"? If it succeeds, we would bring back the other OSDs as well? A rough sketch of that path is below.
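
Something like this, assuming osd.266 is the OSD in question (to be double-checked before actually running anything):

  # allow the cluster to mark OSDs "in" again
  ceph osd unset noin
  # bring the repaired, non-primary OSD back in and watch recovery
  ceph osd in 266
  ceph -w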

OTOH we could try to continue with the key dump from earlier today.
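
(For completeness, and as an assumption on my side about what such a dump looks like: with the OSD stopped, the RocksDB keys can be listed from the BlueStore kv store, e.g.

  # OSD must be stopped for this
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-266 list > keys.txt

and the bucket index omap keys extracted from that.)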

Any opinions?

Thanks!
 Harry

On 13.06.19 09:32, Harald Staub wrote:
On 13.06.19 00:33, Sage Weil wrote:
[...]
One other thing to try before taking any drastic steps (as described
below):

  ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-NNN fsck

This gives: fsck success

and the large alloc warnings:

tcmalloc: large alloc 2145263616 bytes == 0x562412e10000 @ 0x7fed890d6887 0x562385370229 0x5623853703a3 0x5623856c51ec 0x56238566dce2 0x56238566fa05 0x562385681d41 0x562385476201 0x5623853d5737 0x5623853ef418 0x562385420ae1 0x5623852901c2 0x7fed7ddddb97 0x56238536977a
tcmalloc: large alloc 4290519040 bytes == 0x562492bf2000 @ 0x7fed890d6887 0x562385370229 0x5623853703a3 0x5623856c51ec 0x56238566dce2 0x56238566fa05 0x562385681d41 0x562385476201 0x5623853d5737 0x5623853ef418 0x562385420ae1 0x5623852901c2 0x7fed7ddddb97 0x56238536977a
tcmalloc: large alloc 8581029888 bytes == 0x562593068000 @ 0x7fed890d6887 0x562385370229 0x5623853703a3 0x5623856c51ec 0x56238566dce2 0x56238566fa05 0x562385681d41 0x562385476201 0x5623853d5737 0x5623853ef418 0x562385420ae1 0x5623852901c2 0x7fed7ddddb97 0x56238536977a
tcmalloc: large alloc 17162051584 bytes == 0x562792fea000 @ 0x7fed890d6887 0x562385370229 0x5623853703a3 0x5623856c51ec 0x56238566dce2 0x56238566fa05 0x562385681d41 0x562385476201 0x5623853d5737 0x5623853ef418 0x562385420ae1 0x5623852901c2 0x7fed7ddddb97 0x56238536977a
tcmalloc: large alloc 13559291904 bytes == 0x562b92eec000 @ 0x7fed890d6887 0x562385370229 0x56238537181b 0x562385723a99 0x56238566dd25 0x56238566fa05 0x562385681d41 0x562385476201 0x5623853d5737 0x5623853ef418 0x562385420ae1 0x5623852901c2 0x7fed7ddddb97 0x56238536977a

Thanks!
  Harry

[...]
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com