Hi again, hopefully for the last time with problems.

We had an MDS crash earlier, with the MDS staying in a failed state, and used a command to reset the filesystem (this was wrong, I know now; thanks to Patrick Donnelly for pointing this out). I did a full scrub on the filesystem and two files were damaged. One of those got repaired, but the following file keeps giving errors and can't be removed.

What can I do now? Below is some information.

# ceph tell mds.atlassian-prod:0 damage ls
[
    {
        "damage_type": "backtrace",
        "id": 2244444901,
        "ino": 1099534008829,
        "path": "/app1/shared/data/repositories/11271/objects/41/8f82507a0737c611720ed224bcc8b7a24fda01"
    }
]

Trying to repair the error (online research shows this should work for a backtrace damage type)
----------
# ceph tell mds.atlassian-prod:0 scrub start /app1/shared/data/repositories/11271 recursive,repair,force
{
    "return_code": 0,
    "scrub_tag": "d10ead42-5280-4224-971e-4f3022e79278",
    "mode": "asynchronous"
}

Cluster logs after this
----------
1/2/24 9:37:05 AM [INF] scrub summary: idle
1/2/24 9:37:02 AM [INF] scrub summary: idle+waiting paths [/app1/shared/data/repositories/11271]
1/2/24 9:37:01 AM [INF] scrub summary: active paths [/app1/shared/data/repositories/11271]
1/2/24 9:37:01 AM [INF] scrub summary: idle+waiting paths [/app1/shared/data/repositories/11271]
1/2/24 9:37:01 AM [INF] scrub queued for path: /app1/shared/data/repositories/11271

But the error doesn't disappear and I still can't remove the file.

On the client, trying to remove the file (we have a backup)
----------
$ rm -f /mnt/shared_disk-app1/shared/data/repositories/11271/objects/41/8f82507a0737c611720ed224bcc8b7a24fda01
rm: cannot remove '/mnt/shared_disk-app1/shared/data/repositories/11271/objects/41/8f82507a0737c611720ed224bcc8b7a24fda01': Input/output error

Best regards,
Sake