Update to Mimic with prior Snapshots leads to MDS damaged metadata


 



Hi,

I upgraded a Ceph cluster to Mimic yesterday following the release
notes. Specifically, I stopped all standby MDS daemons and then
restarted the only active MDS with the new version.

The cluster was originally installed with Luminous. Its CephFS volume
had snapshots prior to the upgrade, but only one active MDS.

The post-installation steps failed, though:
 ceph daemon mds.<id> scrub_path /
returned an error, which I corrected with
 ceph daemon mds.<id> scrub_path / repair

while
 ceph daemon mds.<id> scrub_path '~mdsdir'
did not show any errors.
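For reference, the full post-upgrade check sequence was roughly the following sketch (run on the host of the active MDS; `<id>` is the daemon name, and passing the `recursive` scrub option alongside `repair` is an assumption on my part, not something I verified):

```shell
# Forward scrub from the filesystem root, repairing what it can
ceph daemon mds.<id> scrub_path / repair

# Scrub the internal MDS directory tree as well
ceph daemon mds.<id> scrub_path '~mdsdir'

# Possibly also walk the whole tree recursively (option combination assumed)
ceph daemon mds.<id> scrub_path / recursive repair

# Afterwards, inspect what damage the MDS has recorded
ceph tell mds.<id> damage ls
```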


After some time, ceph health reported damaged MDS metadata:
> ceph tell mds.<id> damage ls | jq '.[].damage_type' | sort | uniq -c
    398 "backtrace"
    718 "dentry"

Examples of damage:

{
  "damage_type": "dentry",
  "id": 118195760,
  "ino": 1099513350198,
  "frag": "000100*",
  "dname": "1524578400.M820820P705532.dovecot-15-hgjlx,S=425674,W=431250:2,RS",
  "snap_id": "head",
  "path": "/path/to/mails/user/Maildir/.Trash/cur/1524578400.M820820P705532.dovecot-15-hgjlx,S=425674,W=431250:2,RS"
},
{
  "damage_type": "backtrace",
  "id": 121083841,
  "ino": 1099515215027,
  "path": "/path/to/mails/other_user/Maildir/.Junk/cur/1528189963.M416032P698926.dovecot-15-xmpkh,S=4010,W=4100:2,Sab"
}
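With over a thousand entries, the `damage ls` output can also be grouped in Python rather than with jq. A minimal sketch, using only the two entries quoted above (the field names come from that output; nothing beyond them is assumed):

```python
import json
from collections import Counter

# Excerpt of `ceph tell mds.<id> damage ls` output, taken from the
# two example entries quoted in this mail.
damage_ls_output = """
[
  {"damage_type": "dentry", "id": 118195760, "ino": 1099513350198,
   "frag": "000100*", "snap_id": "head",
   "dname": "1524578400.M820820P705532.dovecot-15-hgjlx,S=425674,W=431250:2,RS",
   "path": "/path/to/mails/user/Maildir/.Trash/cur/1524578400.M820820P705532.dovecot-15-hgjlx,S=425674,W=431250:2,RS"},
  {"damage_type": "backtrace", "id": 121083841, "ino": 1099515215027,
   "path": "/path/to/mails/other_user/Maildir/.Junk/cur/1528189963.M416032P698926.dovecot-15-xmpkh,S=4010,W=4100:2,Sab"}
]
"""

entries = json.loads(damage_ls_output)

# Count entries per damage type (same result as the jq | sort | uniq -c pipe)
counts = Counter(e["damage_type"] for e in entries)
print(dict(counts))  # -> {'dentry': 1, 'backtrace': 1}

# List the affected paths per damage type
for dtype in counts:
    for e in entries:
        if e["damage_type"] == dtype:
            print(dtype, e["path"])
```

This makes it easy to extract, say, only the directories containing damaged dentries for further inspection.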


Directories containing damaged entries can still be listed through the
kernel CephFS mount (4.16.7), but not through the FUSE mount, which
stalls.


Can anyone help? This is unfortunately a production cluster.

Regards,
 Tobias Florek
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
