Hi,
What does your 'ceph osd tree' look like, and which rules are in place
for the affected pools? Can you provide more details about those pools,
such as size, min_size, and whether they are replicated or erasure-coded?
The first thing that comes to mind is min_size. For example, if you have
six hosts and an erasure-coded pool with size = 6 and a failure domain of
host, then losing a host leaves the pool degraded with no way to recover,
because there are not enough hosts left to place all shards. Usually,
min_size is lower than size (typically k + 1), so IO would still be
served even if a host goes down, but since we don't know anything about
your cluster yet, we can't really tell what's going on there.
Regards,
Eugen
Quoting Jeff Turmelle <jefft@xxxxxxxxxxxxxxxx>:
We are using NFS-Ganesha to serve data from our Nautilus cluster to
older clients. We recently had an OSD fail, and the NFS server will
not respond while we have degraded data redundancy. This also
happens on the rare occasion when we have some lost objects on a PG.
Is this a known issue and is there a workaround?
—
Jeff Turmelle, Lead Systems Analyst
International Research Institute for Climate and Society
<http://iri.columbia.edu/>
Columbia Climate School <https://climate.columbia.edu/>
cell: (845) 652-3461
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx