On Mon, Dec 11, 2023 at 6:38 AM Eugen Block <eblock@xxxxxx> wrote:
>
> Hi,
>
> I'm trying to help someone with a broken CephFS. We managed to recover
> basic Ceph functionality, but the CephFS is still inaccessible
> (currently read-only). We went through the disaster recovery steps, but
> to no avail. Here's a snippet from the startup logs:
>
> ---snip---
> mds.0.41 Booting: 2: waiting for purge queue recovered
> mds.0.journaler.pq(ro) _finish_probe_end write_pos = 14797504512
> (header had 14789452521). recovered.
> mds.0.purge_queue operator(): open complete
> mds.0.purge_queue operator(): recovering write_pos
> monclient: get_auth_request con 0x55c280bc5c00 auth_method 0
> monclient: get_auth_request con 0x55c280ee0c00 auth_method 0
> mds.0.journaler.pq(ro) _finish_read got error -2
> mds.0.purge_queue _recover: Error -2 recovering write_pos
> mds.0.purge_queue _go_readonly: going readonly because internal IO
> failed: No such file or directory
> mds.0.journaler.pq(ro) set_readonly
> mds.0.41 unhandled write error (2) No such file or directory, force
> readonly...
> mds.0.cache force file system read-only
> force file system read-only
> ---snip---
>
> I've added the dev mailing list; maybe someone can give some advice on
> how to continue from here (we could try to recover with an empty
> metadata pool). Or is this FS lost?

It looks like one of the purge queue journal objects was lost. Were
other objects lost? It would be helpful to know more about the
circumstances of this "broken CephFS". What Ceph version?

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx
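
[Editorial note] The log above shows the MDS purge queue journaler hitting
ENOENT (-2) while recovering its write position, which typically means an
object backing the purge queue journal is missing from the metadata pool.
A hedged sketch of diagnostics one might run here, assuming a live cluster;
the filesystem name `cephfs` and metadata pool name `cephfs_metadata` are
placeholders, and the MDS should be stopped before journal operations:

```shell
# The purge queue journal lives under inode 0x500; list its objects in the
# metadata pool. A gap in the object sequence suggests a lost object.
rados -p cephfs_metadata ls | grep '^500\.' | sort

# Inspect the purge queue journal for damage or gaps.
cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue journal inspect

# Show the purge queue header (compare against the write_pos values from
# the MDS log above).
cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue header get

# Last resort: resetting the purge queue journal discards pending purges
# (space from unlinked files may be leaked) but can let the MDS come up
# writable again. Only after the above diagnostics and a backup.
cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue journal reset
```

These commands require a running cluster and admin credentials, so treat
them as a starting point for investigation rather than a recovery recipe.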