You can run this tool; be sure to read the comments:

https://github.com/ceph/ceph/blob/main/src/tools/cephfs/first-damage.py

As of now, what causes the damage is not yet known, but we are trying to
reproduce it. If your workload reliably produces the damage, a
debug_mds=20 MDS log would be extremely helpful.

On Wed, Nov 30, 2022 at 6:15 PM Stolte, Felix <f.stolte@xxxxxxxxxxxxx> wrote:
>
> Hi Patrick,
>
> it does seem like it. We are not using postgres on cephfs as far as I know. We narrowed it down to three damaged inodes, but the files in question were xlsx, pdf or pst.
>
> Do you have any suggestion how to fix this?
>
> Is there a way to scan the cephfs for damaged inodes?
>
> On 30.11.2022 at 22:49, Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
>
> On Wed, Nov 30, 2022 at 3:10 PM Stolte, Felix <f.stolte@xxxxxxxxxxxxx> wrote:
>
> Hey guys,
>
> our mds daemons are crashing constantly when someone is trying to delete a file:
>
> -26> 2022-11-29T12:32:58.807+0100 7f081b458700 -1 /build/ceph-16.2.10/src/mds/Server.cc: In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' thread 7f081b458700 time 2022-11-29T12:32:58.808844+0100
>
> 2022-11-29T12:32:58.807+0100 7f081b458700  4 mds.0.server handle_client_request client_request(client.1189402075:14014394 unlink #0x100197fa8e0/~$29.11. T.xlsx 2022-11-29T12:32:23.711889+0100 RETRY=1 caller_uid=133365,
>
> I observed that the corresponding object in the cephfs data pool does not exist. Basically, our MDS daemons are crashing each time someone tries to delete a file which does not exist in the data pool but which the metadata says otherwise.
>
> Any suggestions how to fix this problem?
>
> Is this it?
>
> https://tracker.ceph.com/issues/38452
>
> Are you running postgres on CephFS by chance?
>
> --
> Patrick Donnelly, Ph.D.
> He / Him / His
> Principal Software Engineer
> Red Hat, Inc.
> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
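
A minimal sketch of the steps suggested above, assuming a 16.2.x cluster whose metadata pool is named "cephfs_metadata" (the pool name and the first-damage.py arguments are assumptions; the comments at the top of the script describe the exact invocation for your release):

    # Turn up MDS debugging so the next crash produces a useful log.
    # debug_mds=20 is very verbose; revert it once a crash has been captured.
    ceph config set mds debug_mds 20

    # Scan the metadata pool for damaged dentries with first-damage.py.
    # Run it read-only first; the script's comments explain the options
    # for removing damaged entries. Pool name and flags are assumptions.
    python3 first-damage.py --memo run.1 cephfs_metadata

    # Restore the default debug level afterwards.
    ceph config rm mds debug_mds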