Luminous 12.2.8, active+undersized+degraded+inconsistent

Hello,
I am running a Ceph cluster on Luminous 12.2.8 with 36 OSDs.
Today a deep scrub found an error on PG 25.60, and later one of the OSDs
failed. PG 25.60 is now stuck in the active+undersized+degraded+inconsistent
state. I can't repair it with `ceph pg repair 25.60` – the repair process
does not start at all. What is the correct recovery process for this
situation?
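
(For context, a sketch of the usual inspection steps on Luminous with the
standard `ceph`/`rados` CLI – the PG and OSD ids below come from the health
output further down; whether marking osd.6 out is appropriate depends on the
state of its disk:)

```shell
# List the objects that deep-scrub flagged as inconsistent in PG 25.60
rados list-inconsistent-obj 25.60 --format=json-pretty

# Ask the primary to repair the PG (this is the step that does not start
# here, presumably because the acting set is undersized with osd.6 down)
ceph pg repair 25.60

# If osd.6 cannot be brought back, mark it out so the PG can backfill to
# a full-size acting set before attempting the repair again
ceph osd out osd.6
```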


=== ceph health detail ===
HEALTH_ERR 1 osds down; 1 scrub errors; Possible data damage: 1 pg
inconsistent; Degraded data redundancy: 188063/5900718 objects degraded
(3.187%), 117 pgs degraded, 117 pgs undersized
OSD_DOWN 1 osds down
    osd.6 (root=default,host=hv203) is down
OSD_SCRUB_ERRORS 1 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
    pg 25.60 is active+undersized+degraded+inconsistent, acting [25,4]


=== ceph.log ===
2019-04-26 04:01:35.129464 osd.25 osd.25 10.4.5.207:6800/2469060 167 :
cluster [ERR] 25.60 shard 6: soid
25:065e49e9:::rbd_data.3759266b8b4567.0000000000018202:head candidate
had a read error
2019-04-26 04:03:31.533671 osd.25 osd.25 10.4.5.207:6800/2469060 168 :
cluster [ERR] 25.60 deep-scrub 0 missing, 1 inconsistent objects
2019-04-26 04:03:31.533677 osd.25 osd.25 10.4.5.207:6800/2469060 169 :
cluster [ERR] 25.60 deep-scrub 1 errors


=== ceph-osd.6.log ===
2019-04-26 04:53:17.939436 7f6a8ae48700  4 rocksdb:
[/mnt/npool/a.antreich/ceph/ceph-12.2.8/src/rocksdb/db/compaction_job.cc:1403]
[default] [JOB 284] Compacting 4@0 + 4@1 files to L1, score 1.00
2019-04-26 04:53:17.939715 7f6a8ae48700  4 rocksdb:
[/mnt/npool/a.antreich/ceph/ceph-12.2.8/src/rocksdb/db/compaction_job.cc:1407]
[default] Compaction start summary: Base version 283 Base level 0,
inputs: [31929(25MB) 31927(21MB) 31925(22MB) 31923(26MB)], [31912(65MB)
31913(65MB) 31914(65MB) 31915(29MB)]

2019-04-26 04:53:17.939978 7f6a8ae48700  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1556243597939747, "job": 284, "event":
"compaction_started", "files_L0": [31929, 31927, 31925, 31923],
"files_L1": [31912, 31913, 31914, 31915], "score": 1, "input_data_size":
339668148}
2019-04-26 04:53:21.500373 7f6a8ae48700  4 rocksdb:
[/mnt/npool/a.antreich/ceph/ceph-12.2.8/src/rocksdb/db/compaction_job.cc:1116]
[default] [JOB 284] Generated table #31930: 380678 keys, 69567323 bytes
2019-04-26 04:53:21.500410 7f6a8ae48700  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1556243601500399, "cf_name": "default", "job": 284,
"event": "table_file_creation", "file_number": 31930, "file_size":
69567323, "table_properties": {"data_size": 67110779, "index_size":
1349659, "filter_size": 1105896, "raw_key_size": 22641147,
"raw_average_key_size": 59, "raw_value_size": 59452413,
"raw_average_value_size": 156, "num_data_blocks": 16601, "num_entries":
380678, "filter_policy_name": "rocksdb.BuiltinBloomFilter",
"kDeletedKeys": "161626", "kMergeOperands": "0"}}
2019-04-26 04:53:24.294928 7f6a8ae48700  4 rocksdb:
[/mnt/npool/a.antreich/ceph/ceph-12.2.8/src/rocksdb/db/compaction_job.cc:1116]
[default] [JOB 284] Generated table #31931: 118877 keys, 69059681 bytes
2019-04-26 04:53:24.294964 7f6a8ae48700  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1556243604294950, "cf_name": "default", "job": 284,
"event": "table_file_creation", "file_number": 31931, "file_size":
69059681, "table_properties": {"data_size": 67109949, "index_size":
1495694, "filter_size": 453050, "raw_key_size": 10391245,
"raw_average_key_size": 87, "raw_value_size": 63028568,
"raw_average_value_size": 530, "num_data_blocks": 16621, "num_entries":
118877, "filter_policy_name": "rocksdb.BuiltinBloomFilter",
"kDeletedKeys": "1266", "kMergeOperands": "0"}}
2019-04-26 04:53:27.979518 7f6a8ae48700  4 rocksdb:
[/mnt/npool/a.antreich/ceph/ceph-12.2.8/src/rocksdb/db/compaction_job.cc:1116]
[default] [JOB 284] Generated table #31932: 119238 keys, 69066929 bytes
2019-04-26 04:53:27.979545 7f6a8ae48700  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1556243607979532, "cf_name": "default", "job": 284,
"event": "table_file_creation", "file_number": 31932, "file_size":
69066929, "table_properties": {"data_size": 67112338, "index_size":
1499661, "filter_size": 453942, "raw_key_size": 10424324,
"raw_average_key_size": 87, "raw_value_size": 63036698,
"raw_average_value_size": 528, "num_data_blocks": 16599, "num_entries":
119238, "filter_policy_name": "rocksdb.BuiltinBloomFilter",
"kDeletedKeys": "3045", "kMergeOperands": "0"}}
2019-04-26 04:53:31.014387 7f6a8ae48700  3 rocksdb:
[/mnt/npool/a.antreich/ceph/ceph-12.2.8/src/rocksdb/db/db_impl_compaction_flush.cc:1591]
Compaction error: Corruption: block checksum mismatch
2019-04-26 04:53:31.014409 7f6a8ae48700  4 rocksdb: (Original Log Time
2019/04/26-04:53:31.012695)
[/mnt/npool/a.antreich/ceph/ceph-12.2.8/src/rocksdb/db/compaction_job.cc:621]
[default] compacted to: base level 1 max bytes base 268435456 files[4 4
16 0 0 0 0] max score 0.29, MB/sec: 26.0 rd, 21.2 wr, level 1, files
in(4, 4) out(4) MB in(97.3, 226.6) out(263.9), read-write-amplify(6.0)
write-amplify(2.7) Corruption: block checksum mismatch, records in:
975162, records dropped: 32804

2019-04-26 04:53:31.014413 7f6a8ae48700  4 rocksdb: (Original Log Time
2019/04/26-04:53:31.014231) EVENT_LOG_v1 {"time_micros":
1556243611012706, "job": 284, "event": "compaction_finished",
"compaction_time_micros": 13072480, "output_level": 1,
"num_output_files": 4, "total_output_size": 276762847,
"num_input_records": 788663, "num_output_records": 755859,
"num_subcompactions": 1, "num_single_delete_mismatches": 0,
"num_single_delete_fallthrough": 0, "lsm_state": [4, 4, 16, 0, 0, 0, 0]}
2019-04-26 04:53:31.014415 7f6a8ae48700  2 rocksdb:
[/mnt/npool/a.antreich/ceph/ceph-12.2.8/src/rocksdb/db/db_impl_compaction_flush.cc:1275]
Waiting after background compaction error: Corruption: block checksum
mismatch, Accumulated background error counts: 1
2019-04-26 04:53:31.143810 7f6a9ae68700 -1 rocksdb: submit_transaction
error: Corruption: block checksum mismatch code = 2 Rocksdb transaction:
Put( Prefix = M key =
0x0000000000097374'.0000018493.00000000000000927272' Value size = 184)
Put( Prefix = M key = 0x0000000000097374'._fastinfo' Value size = 186)
Put( Prefix = O key =
0x7f80000000000000190fae4189217262'd_data.25db7f6b8b4567.0000000000001f45!='0xfffffffffffffffeffffffffffffffff6f00000000'x'
Value size = 540)
Put( Prefix = O key =
0x7f80000000000000190fae4189217262'd_data.25db7f6b8b4567.0000000000001f45!='0xfffffffffffffffeffffffffffffffff'o'
Value size = 429)
Put( Prefix = L key = 0x00000000003cc72c Value size = 16423)
2019-04-26 04:53:31.152093 7f6a9ae68700 -1
/mnt/npool/a.antreich/ceph/ceph-12.2.8/src/os/bluestore/BlueStore.cc: In
function 'void BlueStore::_kv_sync_thread()' thread 7f6a9ae68700 time
2019-04-26 04:53:31.144381
/mnt/npool/a.antreich/ceph/ceph-12.2.8/src/os/bluestore/BlueStore.cc:
8537: FAILED assert(r == 0)

 ceph version 12.2.8 (6f01265ca03a6b9d7f3b7f759d8894bb9dbb6840) luminous
(stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x102) [0x55acc062cab2]
 2: (BlueStore::_kv_sync_thread()+0x24b2) [0x55acc04ba332]
 3: (BlueStore::KVSyncThread::entry()+0xd) [0x55acc04fa7ed]
 4: (()+0x7494) [0x7f6aab206494]
 5: (clone()+0x3f) [0x7f6aaa28dacf]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
