Hello everyone,
I've conducted some crash tests (unplugging drives, unplugging the
machine, terminating and restarting ceph systemd services) with Ceph
12.2.0 on Ubuntu and quite easily managed to corrupt what appears to
be rocksdb's write-ahead log on a bluestore OSD, so that the log
replay now fails:
# ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-2/
[...]
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2859]
Recovered from manifest file:db/MANIFEST-000975
succeeded,manifest_file_number is 975, next_file_number is 1008,
last_sequence is 51965907, log_number is 0,prev_log_number is
0,max_column_family is 0
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2867]
Column family [default] (ID 0), log number is 1005
4 rocksdb: EVENT_LOG_v1 {"time_micros": 1509298585082794, "job":
1, "event": "recovery_started", "log_files": [1003, 1005]}
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482]
Recovering log #1003 mode 0
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482]
Recovering log #1005 mode 0
3 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:424]
db/001005.log: dropping 3225 bytes; Corruption: missing start of
fragmented record(2)
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:217]
Shutdown: canceling all background work
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:343]
Shutdown complete
-1 rocksdb: Corruption: missing start of fragmented record(2)
-1 bluestore(/var/lib/ceph/osd/ceph-2/) _open_db erroring opening
db:
1 bluefs umount
1 bdev(0x557f5b6a4240 /var/lib/ceph/osd/ceph-2//block) close
If I understand this right, rocksdb is just trying to replay its
write-ahead log (WAL) files, of which "001005.log" is presumably
corrupted. It then returns an error that aborts opening the database,
and with it everything else.
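To illustrate what I think the message means: as far as I understand
rocksdb's WAL format, the log is a sequence of 32 KiB blocks holding
physical records typed FULL, FIRST, MIDDLE or LAST, and a LAST record
that is not preceded by a FIRST is reported as "missing start of
fragmented record(2)". Below is a minimal Python sketch of that state
machine, not rocksdb's actual reader. The header layout (crc, length,
type, plus a log number for recycled logs) and the numeric type codes
are my assumptions about the format, the function names are made up
for this post, and checksums are not verified at all.

```python
import struct

BLOCK_SIZE = 32768        # assumed rocksdb log block size
LEGACY_HEADER = 7         # crc32c (4) + length (2) + type (1)
RECYCLABLE_HEADER = 11    # legacy header + log number (4)
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4
NAMES = {FULL: "FULL", FIRST: "FIRST", MIDDLE: "MIDDLE", LAST: "LAST"}


def make_record(rtype, payload, log_number=None):
    """Build one physical record with a dummy zero checksum (demo only)."""
    header = struct.pack("<IHB", 0, len(payload), rtype)
    if log_number is not None:
        header += struct.pack("<I", log_number)
    return header + payload


def scan_wal(data):
    """Return a list of (offset, type_name, payload_length, problem)."""
    out = []
    in_fragment = False
    pos = 0
    while pos + LEGACY_HEADER <= len(data):
        block_left = BLOCK_SIZE - pos % BLOCK_SIZE
        if block_left < LEGACY_HEADER:
            pos += block_left          # too small for a header: padding
            continue
        length, rtype = struct.unpack_from("<HB", data, pos + 4)
        if rtype == 0 and length == 0:
            pos += block_left          # zero-filled area, skip the block
            continue
        recyclable = 5 <= rtype <= 8   # assumed codes for recycled logs
        kind = rtype - 4 if recyclable else rtype
        header = RECYCLABLE_HEADER if recyclable else LEGACY_HEADER
        if kind not in NAMES:
            out.append((pos, "type %d" % rtype, length, "unknown record type"))
            break
        problem = None
        if kind in (FULL, FIRST) and in_fragment:
            problem = "previous fragmented record has no end"
        elif kind in (MIDDLE, LAST) and not in_fragment:
            problem = "missing start of fragmented record(%d)" % (
                1 if kind == MIDDLE else 2)
        # FIRST opens a fragmented record, FULL and LAST close it,
        # MIDDLE leaves the state unchanged.
        if kind == FIRST:
            in_fragment = True
        elif kind in (FULL, LAST):
            in_fragment = False
        out.append((pos, NAMES[kind], length, problem))
        pos += header + length
    return out


if __name__ == "__main__":
    # A lone LAST record from a recycled log, as a stand-in for the
    # kind of damage I suspect in 001005.log.
    for entry in scan_wal(make_record(8, b"tail", log_number=1005)):
        print(entry)
```

Running something like this against the real log would of course
require having 001005.log available as a regular file, which is
exactly what I have not managed so far.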
I did try to mount the bluestore, assuming that this is where I would
find rocksdb's files, but that doesn't seem to be possible either:
# ceph-objectstore-tool --op fsck --data-path
/var/lib/ceph/osd/ceph-2/ --mountpoint /mnt/bluestore-repair/
fsck failed: (5) Input/output error
# ceph-objectstore-tool --op fuse --data-path
/var/lib/ceph/osd/ceph-2 --mountpoint /mnt/bluestore-repair/
Mount failed with '(5) Input/output error'
# ceph-objectstore-tool --op fuse --force --skip-journal-replay
--data-path /var/lib/ceph/osd/ceph-2 --mountpoint
/mnt/bluestore-repair/
Mount failed with '(5) Input/output error'
Adding --debug shows the ultimate culprit is just the above rocksdb
error again.
Q: Is there some way to tell rocksdb to truncate, delete or skip the
affected log entries? Or can I get access to rocksdb's files in some
other way, so that I can manipulate or delete the corrupted WAL files
manually?
-Michael
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com