Re: 15.2.2 Upgrade - Corruption: error in middle of record

Hey Igor,



The OSDs only back two metadata pools, so they only hold a couple of MB of data (hence they were easy and quick to rebuild). They're actually NVMe LVM devices passed through QEMU into a VM (hence only 10GB and showing as rotational).
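For reference, the rotational flag QEMU presents can be checked from inside the VM; "vda" below is just an example device name:

    lsblk -d -o NAME,ROTA,SIZE               # ROTA=1 means the kernel sees it as rotational
    cat /sys/block/vda/queue/rotational     # prints 1 for rotational, 0 for SSD/NVMe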



I have large 10TB disks that back the EC (RBD/FS) pools themselves, but I don't dare touch them right now.



Thanks



---- On Wed, 20 May 2020 21:34:14 +0800 Igor Fedotov <ifedotov@xxxxxxx> wrote ----


Thanks!

So for now I can see the following similarities between your case and the ticket:

1) Single main spinner as an OSD backing device.

2) Corruption happens to a RocksDB WAL file.

3) OSD has user data compression enabled.
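For reference, compression can come from either the pool or the OSD config; something like the following shows both (the pool name is just an example, and "ceph osd pool get" only returns the option if it has been set on that pool):

    ceph osd pool get cephfs_metadata compression_mode
    ceph config get osd bluestore_compression_mode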



And one more question, about the following line:

May 20 06:05:14 sn-m03 bash[86466]: debug 2020-05-20T06:05:14.466+0000 7f60da5a5ec0 1 bdev(0x558e8fce0000 /var/lib/ceph/osd/ceph-26/block) open size 10733223936 (0x27fc00000, 10 GiB) block_size 4096 (4 KiB) rotational discard supported



I can see that your main device is pretty small - 10 GiB only? That looks rather small for any real usage. Is this the correct size?
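If in doubt, the size recorded in the BlueStore device label can be double-checked with something like the following (path taken from your log above; best run with the OSD stopped):

    ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-26/block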

Surprisingly, the failed QA run had a 9.5 GiB volume too. I don't know if this has any meaning or if it's just a coincidence, though...



Thanks,

Igor



On 5/20/2020 4:01 PM, Ashley Merrick wrote:




I attached the log but it was too big and got moderated.



Here it is in a pastebin: https://pastebin.pl/view/69b2beb9



I have cut the log to start from the point of the original upgrade.



Thanks



---- On Wed, 20 May 2020 20:55:51 +0800 Igor Fedotov <ifedotov@xxxxxxx> wrote ----


Dan, thanks for the info. Good to know.

The failed QA run in the ticket uses snappy, though.

And in fact any stuff writing to process memory can introduce data corruption in a similar manner.

So I will keep that in mind, but IMO the relation to compression is still not evident...


Kind regards,

Igor
 
 
 On 5/20/2020 3:32 PM, Dan van der Ster wrote:
 > lz4? It's not obviously related, but I've seen it involved in really
 > non-obvious ways: https://tracker.ceph.com/issues/39525
 >
 > -- dan
 >
 > On Wed, May 20, 2020 at 2:27 PM Ashley Merrick <singapore@xxxxxxxxxxxxxx> wrote:
 >> Thanks, FYI the OSDs that went down back two pools: an erasure code meta (RBD) pool and the CephFS meta pool. The CephFS pool does have compression enabled (I noticed it mentioned in the Ceph tracker).
 >> 
 >> 
 >> 
 >> Thanks 
 >> 
 >> 
 >> 
 >> 
 >> 
 >> ---- On Wed, 20 May 2020 20:17:33 +0800 Igor Fedotov <ifedotov@xxxxxxx> wrote ----
 >> 
 >> 
 >> 
 >> Hi Ashley, 
 >> 
 >> looks like this is a regression. Neha observed similar error(s) during her QA run, see https://tracker.ceph.com/issues/45613
 >> 
 >> 
 >> Please preserve the broken OSDs for a while if possible; I'll likely come back to you for more information to troubleshoot.
 >> 
 >> 
 >> Thanks, 
 >> 
 >> Igor 
 >> 
 >> On 5/20/2020 1:26 PM, Ashley Merrick wrote: 
 >> 
 >>> So from reading online it looked like a dead-end error, so I recreated the 3 OSDs on that node and they are now working fine after a reboot.
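 >>> (Recreating them was just the usual purge and re-add flow, roughly along these lines - the OSD id and device path here are only examples:
 >>>
 >>>     ceph osd purge 0 --yes-i-really-mean-it
 >>>     ceph orch daemon add osd sn-m01:/dev/vdb
 >>> )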
 >>> 
 >>> 
 >>> 
 >>> However, I restarted the next server with 3 OSDs and one of them is now facing the same issue.
 >>> 
 >>> 
 >>> 
 >>> Let me know if you need any more logs. 
 >>> 
 >>> 
 >>> 
 >>> Thanks 
 >>> 
 >>> 
 >>> 
 >>> ---- On Wed, 20 May 2020 17:02:31 +0800 Ashley Merrick <singapore@xxxxxxxxxxxxxx> wrote ----
 >>> 
 >>> 
 >>> I just upgraded a cephadm cluster from 15.2.1 to 15.2.2.
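 >>> (The upgrade was started the standard cephadm way, i.e. something like:
 >>>
 >>>     ceph orch upgrade start --ceph-version 15.2.2
 >>> )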
 >>> 
 >>> 
 >>> 
 >>> Everything went fine with the upgrade; however, after restarting one node that has 3 OSDs for ecmeta, two of the 3 OSDs now won't boot with the following error:
 >>> 
 >>> 
 >>> 
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 4 rocksdb: [db/version_set.cc:3757] Recovered from manifest succeeded,manifest_file_number is 2768, next_file_number is 2775, last_sequence is 188026749, log_number is 2767,prev_log_number is 0,max_column_family is 0,min_log_number_to_keep is 0
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 4 rocksdb: [db/version_set.cc:3766] Column family [default] (ID 0), log number is 2767
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1589963382599157, "job": 1, "event": "recovery_started", "log_files": [2769]}
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 4 rocksdb: [db/db_impl_open.cc:583] Recovering log #2769 mode 0
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 537526 bytes; Corruption: error in middle of record
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 32757 bytes; Corruption: missing start of fragmented record(1)
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 3 rocksdb: [db/db_impl_open.cc:518] db/002769.log: dropping 23263 bytes; Corruption: missing start of fragmented record(2)
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 4 rocksdb: [db/db_impl.cc:390] Shutdown: canceling all background work
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 4 rocksdb: [db/db_impl.cc:563] Shutdown complete
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 -1 rocksdb: Corruption: error in middle of record
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) _open_db erroring opening db:
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.602+0000 7fbcc46f7ec0 1 bdev(0x558a28dd0700 /var/lib/ceph/osd/ceph-0/block) close
 >>> May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.870+0000 7fbcc46f7ec0 1 bdev(0x558a28dd0000 /var/lib/ceph/osd/ceph-0/block) close
 >>> May 20 08:29:43 sn-m01 bash[6833]: debug 2020-05-20T08:29:43.118+0000 7fbcc46f7ec0 -1 osd.0 0 OSD:init: unable to mount object store
 >>> May 20 08:29:43 sn-m01 bash[6833]: debug 2020-05-20T08:29:43.118+0000 7fbcc46f7ec0 -1 ** ERROR: osd init failed: (5) Input/output error
 >>> 
 >>> 
 >>> 
 >>> Have I hit a bug, or is there something I can do to try and fix these OSDs?
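 >>> (Would a BlueStore fsck be safe to try here, e.g.:
 >>>
 >>>     ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-0
 >>>
 >>> or could that make things worse?)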
 >>> 
 >>> 
 >>> 
 >>> Thanks 
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx