Re: MDS crash

Alexey GERASIMOV <alexey.gerasimov@xxxxxxxxxxxxxxx> · Sat, 27 Apr 2024 11:34:24 +0000

Colleagues, thank you for the advice to check that the MGRs were operating correctly. It is strange, though: we checked our nodes for network issues (IP connectivity, sockets, ACLs, DNS) and found nothing wrong, yet simply restarting all MGRs resolved both the stale PGs and the hanging ceph commands!

So we are back at the starting point: Ceph is working, except that the MDS daemons crash. But now we see some additional errors in the MDS logs when we try to start the daemon:

dir 0x1000dd10fa0 object missing on disk; some files may be lost (/volumes/csi/csi-vol-2eb40f89-f2e1-11ee-b657-3aa98da4c4a6/1080803d-1277-4ad8-ae80-a004bd3a5699/gallery/pc-12083932925583528732)

dir 0x1000dd10f9d object missing on disk; some files may be lost (/volumes/csi/csi-vol-2eb40f89-f2e1-11ee-b657-3aa98da4c4a6/1080803d-1277-4ad8-ae80-a004bd3a5699/cadserver-filevault/project-files/661fb14d341d3746ea5c2a8f
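
For anyone who wants to collect all such lines from a larger MDS log, here is a minimal Python sketch, not a tested recovery tool. It pulls the directory inode and path out of each "object missing on disk" message and derives the name that the corresponding directory object would normally have in the metadata pool. The helper names (parse_missing_dirs, dirfrag_object_name) are made up for illustration, and the object name assumes an unfragmented directory, i.e. "<inode in hex>.00000000"; a fragmented directory would use different suffixes.

```python
import re

# Matches MDS log lines of the form quoted above:
#   dir 0x<hex inode> object missing on disk; some files may be lost (<path>)
# The closing parenthesis is optional because a pasted line may be cut off.
MISSING_DIR = re.compile(
    r"dir (0x[0-9a-f]+) object missing on disk; "
    r"some files may be lost \((.*?)\)?$"
)


def dirfrag_object_name(ino, frag=0):
    """Return the assumed metadata-pool object name for a directory.

    Assumption: the directory is not fragmented, so the suffix is
    eight zero hex digits.
    """
    return f"{ino:x}.{frag:08x}"


def parse_missing_dirs(log_text):
    """Return one dict per 'object missing on disk' line in log_text."""
    results = []
    for line in log_text.splitlines():
        match = MISSING_DIR.search(line.strip())
        if match is None:
            continue  # unrelated log line
        ino = int(match.group(1), 16)
        results.append(
            {
                "ino": ino,
                "path": match.group(2),
                "object": dirfrag_object_name(ino),
            }
        )
    return results


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < mds.log
    for entry in parse_missing_dirs(sys.stdin.read()):
        print(entry["object"], entry["path"])
```

Each printed object name could then be checked by hand with something like "rados -p <metadata pool> stat <object name>", where the pool name is whatever the filesystem's metadata pool is called.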

I promised to file a bug report and will do so a bit later. But is there anything more I should try on my side in the meantime? Here is exactly what I ran last time:

cephfs-journal-tool journal reset
cephfs-table-tool all reset session
cephfs-data-scan scan_extents
cephfs-data-scan scan_inodes
cephfs-data-scan scan_links
cephfs-data-scan cleanup

And one more question: is it possible to access CephFS content directly, without an MDS?