It is a single shared main device. Sadly I had already rebuilt the failed OSDs to bring me back into the green after a while. I have just tried a few restarts and none are failing (it seems that after a rebuild using 15.2.2 they are stable?). I don't have any other servers/OSDs I am willing to risk not starting right this minute; if it does happen again I will grab the logs.

@Dan: yeah, it is using lz4.

Thanks

---- On Wed, 20 May 2020 20:30:27 +0800 Igor Fedotov <ifedotov@xxxxxxx> wrote ----

I don't believe compression is related, to be honest.

Wondering if these OSDs have standalone WAL and/or DB devices or just a single shared main device.

Also, could you please set debug-bluefs/debug-bluestore to 20 and collect a startup log for a broken OSD?

Kind regards,
Igor
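Something like the following should cover both of those asks; a rough sketch, assuming a cephadm deployment, with osd.0 standing in for a broken OSD and <fsid> as a placeholder for the cluster fsid:

    # On the OSD host: show each OSD's devices, including any
    # standalone WAL/DB devices (vs. a single shared main device).
    cephadm ceph-volume lvm list

    # Raise the debug levels for just the broken OSD.
    ceph config set osd.0 debug_bluefs 20
    ceph config set osd.0 debug_bluestore 20

    # Restart it and capture the startup log (<fsid> is a placeholder).
    systemctl restart ceph-<fsid>@osd.0.service
    cephadm logs --name osd.0 > osd.0-startup.log
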
On 5/20/2020 3:27 PM, Ashley Merrick wrote:

Thanks. FYI, the OSDs that went down back two pools: an erasure-code meta pool (RBD) and the CephFS meta pool. The CephFS pool does have compression enabled (I noticed it mentioned in the Ceph tracker).

Thanks

---- On Wed, 20 May 2020 20:17:33 +0800 Igor Fedotov <ifedotov@xxxxxxx> wrote ----

Hi Ashley,

looks like this is a regression. Neha observed similar error(s) during her QA run, see https://tracker.ceph.com/issues/45613

Please preserve the broken OSDs for a while if possible; likely I'll come back to you for more information to troubleshoot.

Thanks,
Igor

On 5/20/2020 1:26 PM, Ashley Merrick wrote:
> So reading online it looked like a dead-end error, so I recreated the 3 OSDs on that node, and they are now working fine after a reboot.
>
> However, I restarted the next server with 3 OSDs, and one of them is now facing the same issue.
>
> Let me know if you need any more logs.
>
> Thanks
>
> ---- On Wed, 20 May 2020 17:02:31 +0800 Ashley Merrick <singapore@xxxxxxxxxxxxxx> wrote ----
>
> > I just upgraded a cephadm cluster from 15.2.1 to 15.2.2.
> >
> > Everything went fine on the upgrade; however, after restarting one node that has 3 OSDs for ecmeta, two of the 3 OSDs now won't boot, with the following error:
> >
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 4 rocksdb: [db/version_set.cc:3757] Recovered from manifest succeeded,manifest_file_number is 2768, next_file_number is 2775, last_sequence is 188026749, log_number is 2767,prev_log_number is 0,max_column_family is 0,min_log_number_to_keep is 0
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 4 rocksdb: [db/version_set.cc:3766] Column family [default] (ID 0), log number is 2767
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1589963382599157, "job": 1, "event": "recovery_started", "log_files": [2769]}
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 4 rocksdb: [db/db_impl_open.cc:583] Recovering log #2769 mode 0
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 537526 bytes; Corruption: error in middle of record
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 23263 bytes; Corruption: missing start of fragmented record(2)
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 4 rocksdb: [db/db_impl.cc:390] Shutdown: canceling all background work
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 4 rocksdb: [db/db_impl.cc:563] Shutdown complete
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 -1 rocksdb: Corruption: error in middle of record
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) _open_db erroring opening db:
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 1 bdev(0x558a28dd0700 /var/lib/ceph/osd/ceph-0/block) close
> > May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.870+0000 7fbcc46f7ec0 1 bdev(0x558a28dd0000 /var/lib/ceph/osd/ceph-0/block) close
> > May 20 08:29:43 sn-m01 bash[6833]: debug 2020-05-20T08:29:43.118+0000 7fbcc46f7ec0 -1 osd.0 0 OSD:init: unable to mount object store
> > May 20 08:29:43 sn-m01 bash[6833]: debug 2020-05-20T08:29:43.118+0000 7fbcc46f7ec0 -1 ** ERROR: osd init failed: (5) Input/output error
> >
> > Have I hit a bug, or is there something I can do to try and fix these OSDs?
> >
> > Thanks
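For anyone wanting to check the damage before rebuilding: a read-only check with ceph-bluestore-tool can help confirm whether the corruption is confined to the RocksDB log. A sketch, again assuming a cephadm deployment with osd.0 as the affected daemon and <fsid> as a placeholder:

    # Stop the daemon, then enter a container that has the OSD's
    # config and data directory mapped in.
    systemctl stop ceph-<fsid>@osd.0.service
    cephadm shell --name osd.0

    # Inside the shell: run a consistency check against the store.
    ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-0
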
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
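For reference, the pool-level compression discussed above (lz4 on the CephFS metadata pool) can be confirmed with something like the following, where <pool> is a placeholder for a pool name reported by ceph fs ls:

    # List the metadata/data pools backing each filesystem.
    ceph fs ls

    # Show the compression settings on a given pool.
    ceph osd pool get <pool> compression_mode
    ceph osd pool get <pool> compression_algorithm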