Hello,

On Wed, 1 Nov 2017 09:30:06 +0100 Michael wrote:

> Hello everyone,
>
> I've conducted some crash tests (unplugging drives, the machine,

Your exact system configuration (HW, drives, controller, settings, etc.)
would be interesting, as I can think of plenty of scenarios that corrupt
things which normally shouldn't be affected by such actions.

> terminating and restarting ceph systemd services) with Ceph 12.2.0 on

Now that bit is quite disconcerting, though you're one release behind the
curve and from what I read .2 has plenty more bug fixes coming.

Christian

> Ubuntu and quite easily managed to corrupt what appears to be rocksdb's
> log replay on a bluestore OSD:
>
> # ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-2/
> [...]
> 4 rocksdb:
> [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2859]
> Recovered from manifest file:db/MANIFEST-000975
> succeeded,manifest_file_number is 975, next_file_number is 1008,
> last_sequence is 51965907, log_number is 0,prev_log_number is
> 0,max_column_family is 0
> 4 rocksdb:
> [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2867]
> Column family [default] (ID 0), log number is 1005
> 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1509298585082794, "job": 1,
> "event": "recovery_started", "log_files": [1003, 1005]}
> 4 rocksdb:
> [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482]
> Recovering log #1003 mode 0
> 4 rocksdb:
> [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482]
> Recovering log #1005 mode 0
> 3 rocksdb:
> [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:424]
> db/001005.log: dropping 3225 bytes; Corruption: missing start of
> fragmented record(2)
> 4 rocksdb:
> [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:217] Shutdown:
> canceling all background work
> 4 rocksdb:
> [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:343] Shutdown
> complete
> -1 rocksdb: Corruption: missing start of fragmented record(2)
> -1 bluestore(/var/lib/ceph/osd/ceph-2/) _open_db erroring opening db:
> 1 bluefs umount
> 1 bdev(0x557f5b6a4240 /var/lib/ceph/osd/ceph-2//block) close
>
> If I understand this right, rocksdb is just trying to replay WAL type
> logs, of which presumably "001005.log" is corrupted. It then throws an
> error that stops everything.
>
> I did try to mount the bluestore, as I was assuming that would probably
> be where I'd find rocksdb's files somewhere, but that also doesn't seem
> possible:
>
> # ceph-objectstore-tool --op fsck --data-path /var/lib/ceph/osd/ceph-2/
> --mountpoint /mnt/bluestore-repair/
> fsck failed: (5) Input/output error
> # ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-2
> --mountpoint /mnt/bluestore-repair/
> Mount failed with '(5) Input/output error'
> # ceph-objectstore-tool --op fuse --force --skip-journal-replay
> --data-path /var/lib/ceph/osd/ceph-2 --mountpoint /mnt/bluestore-repair/
> Mount failed with '(5) Input/output error'
>
> Adding --debug shows the ultimate culprit is just the above rocksdb
> error again.
>
> Q: Is there some way in which I can tell rocksdb to truncate or delete /
> skip the respective log entries? Or can I get access to rocksdb('s
> files) in some other way to just manipulate it or delete corrupted WAL
> files manually?
>
> -Michael
>

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Rakuten Communications