Hmmm... more and more PGs are broken:

# ceph health detail
HEALTH_ERR 1 filesystem is degraded; 1 filesystem has a failed mds daemon; 1 filesystem is offline; insufficient standby MDS daemons available; 46 scrub errors; Possible data damage: 32 pgs inconsistent; 2625 daemons have recently crashed
[WRN] FS_DEGRADED: 1 filesystem is degraded
    fs cephfs is degraded
[WRN] FS_WITH_FAILED_MDS: 1 filesystem has a failed mds daemon
    fs cephfs has 1 failed mds
[ERR] MDS_ALL_DOWN: 1 filesystem is offline
    fs cephfs is offline because no MDS is active for it.
[WRN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
    have 0; want 1 more
[ERR] OSD_SCRUB_ERRORS: 46 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 32 pgs inconsistent
    pg 1.3 is active+clean+inconsistent, acting [4,15,25]
    pg 1.5 is active+clean+inconsistent, acting [22,8,11]
    pg 1.a is active+clean+inconsistent, acting [23,19,6]
    pg 1.10 is active+clean+inconsistent, acting [18,22,0]
    pg 1.1c is active+clean+inconsistent+failed_repair, acting [28,16,9]
    pg 1.1e is active+clean+inconsistent, acting [22,10,6]
    pg 1.26 is active+clean+inconsistent, acting [22,2,17]
    pg 1.35 is active+clean+inconsistent, acting [27,7,11]
    pg 1.37 is active+clean+inconsistent+failed_repair, acting [7,16,26]
    pg 1.3d is active+clean+inconsistent, acting [0,17,22]
    pg 5.47 is active+clean+inconsistent, acting [8,28,13]
    pg 5.90 is active+clean+inconsistent+failed_repair, acting [13,9,21]
    pg 5.a6 is active+clean+inconsistent, acting [20,19,8]
    pg 5.b0 is active+clean+inconsistent, acting [20,3,17]
    pg 5.b2 is active+clean+inconsistent, acting [24,11,9]
    pg 5.b3 is active+clean+inconsistent, acting [1,23,18]
    pg 5.d0 is active+clean+inconsistent, acting [27,4,14]
    pg 5.d2 is active+clean+inconsistent, acting [15,24,0]
    pg 11.5 is active+clean+inconsistent, acting [11,3,25]
    pg 11.17 is active+clean+inconsistent, acting [24,19,8]
    pg 16.0 is active+clean+inconsistent, acting [5,15,24]
    pg 16.2 is active+clean+inconsistent, acting [12,1,27]
    pg 16.15 is active+clean+inconsistent, acting [0,28,11]
    pg 16.17 is active+clean+inconsistent, acting [2,21,13]
    pg 16.1c is active+clean+inconsistent, acting [25,7,15]
    pg 16.25 is active+clean+inconsistent, acting [15,4,25]
    pg 16.2f is active+clean+inconsistent, acting [20,13,1]
    pg 16.38 is active+clean+inconsistent, acting [2,18,22]
    pg 16.3a is active+clean+inconsistent, acting [12,1,20]
    pg 16.3d is active+clean+inconsistent, acting [21,19,6]
    pg 16.3e is active+clean+inconsistent, acting [14,9,21]
    pg 16.3f is active+clean+inconsistent, acting [23,5,15]
[WRN] RECENT_CRASH: 2625 daemons have recently crashed
    client.admin crashed on host pve06 at 2021-09-30T05:08:19.213324Z
    mds.pve05 crashed on host pve05 at 2021-09-30T06:09:49.543530Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:10:22.059405Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:10:26.077956Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:10:30.117664Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:10:34.149385Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:10:37.607766Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:10:41.639585Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:10:45.684791Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:10:49.711284Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:10:53.757538Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:10:57.622000Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:11:01.798656Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:11:05.821116Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:11:09.860788Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:11:13.903719Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:11:17.630383Z
    mds.pve04 crashed on host pve04 at 2021-09-30T13:11:21.948918Z
    mds.pve05 crashed on host pve05 at 2021-09-30T13:11:25.979666Z
    mds.pve05 crashed on host pve05 at 2021-09-30T13:11:30.013149Z
    mds.pve05 crashed on host pve05 at 2021-09-30T13:11:34.044069Z
    mds.pve05 crashed on host pve05 at 2021-09-30T13:11:37.633660Z
    mds.pve05 crashed on host pve05 at 2021-09-30T13:11:41.664662Z
    mds.pve05 crashed on host pve05 at 2021-09-30T13:11:45.690034Z
    mds.pve05 crashed on host pve05 at 2021-09-30T13:11:49.735077Z
    mds.pve05 crashed on host pve05 at 2021-09-30T13:11:53.765387Z
    mds.pve05 crashed on host pve05 at 2021-09-30T13:11:57.655313Z
    mds.pve05 crashed on host pve05 at 2021-09-30T13:12:01.812882Z
    mds.pve06 crashed on host pve06 at 2021-09-30T13:12:05.838469Z
    mds.pve06 crashed on host pve06 at 2021-09-30T13:12:09.874958Z
    and 2595 more

For now, I have stopped all three MDS daemons.

At the risk of making a fool of myself: how do I check what data is in a PG? I have already made a backup at the beginning with "cephfs-journal-tool journal export backup.bin"; beyond that there is only a limited backup of the data itself from the Ceph cluster.

regards,
volker.
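(In case it helps, a rough sketch of how the contents of a single PG can be inspected with the standard ceph/rados CLI. The PG id 16.2f is just one example taken from the health output above; exact flags can differ between releases, so treat this as a starting point rather than a recipe:

# ceph osd pool ls detail
    shows which pool id - the number before the dot in a PG id - belongs to which pool
# rados list-inconsistent-obj 16.2f --format=json-pretty
    lists the objects the scrub flagged in that PG; needs a reasonably recent deep-scrub
# rados --pgid 16.2f ls
    lists the objects stored in that PG; depending on the release it may also want the pool via -p
# ceph pg 16.2f query
    full state of the PG, including the acting OSDs and scrub stamps

For a CephFS data pool the object names look like "<inode in hex>.<block index>", so they can be traced back to files via the inode number; the disaster-recovery docs also describe "cephfs-data-scan pg_files <path> <pg id>" for mapping PGs to affected files, but as far as I know that only works while the filesystem can still be mounted.)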
________________________________
From: Stefan Kooman <stefan@xxxxxx>
Sent: Sunday, 3 October 2021 10:39:20
To: von Hoesslin, Volker; ceph-users@xxxxxxx
Subject: Re: MDS: corrupted header/values: decode past end of struct encoding: Malformed input

On 10/1/21 14:07, von Hoesslin, Volker wrote:
> is there any chance to fix this? there are some "advanced metadata
> repair tools"
> (https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/)
> but i'm not really sure if it is the right way to handle this issue?

Are you sure the PGs that are inconsistent don't have anything to do
with the MDS issues? What data is on those PGs?

> i have created a "backup" before any tries with this command:
>
> cephfs-journal-tool journal export backup.bin
>
> maybe i can delete the mds database and recreate it? is this possible?

The experts link you pasted shows how you can do this. But I would
consider this a last resort. Do you have backups?

Does an increased debug level for the MDS show any more clues
(debug_mds 20/20)?

Gr. Stefan
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
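(For reference, a minimal sketch of how the MDS debug level Stefan suggests can be raised. The daemon name pve04 is just taken from the crash list above, and the commands assume a reasonably recent release with the centralized config database:

# ceph config set mds debug_mds 20/20
    applies to all MDS daemons and also takes effect when a daemon is restarted
# ceph tell mds.pve04 config set debug_mds 20/20
    only works while that particular daemon is running

Alternatively, "debug_mds = 20/20" can be set in the [mds] section of ceph.conf before restarting the daemon; the extra output then ends up in /var/log/ceph/ceph-mds.<name>.log on the respective host.)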