Hi,
What does your 'ceph osd tree' look like, and which rules are in place
for the affected pools? Can you provide more details about those pools,
such as size, min_size, and whether they are replicated or erasure-coded?
The first thing that comes to mind is min_size. For example, if you have
six hosts and an erasure-coded pool with size = 6 and a failure domain of
host, then losing a host leaves the pool degraded with no way to recover,
because there are not enough hosts left to place all shards. Usually,
min_size is lower than size (typically k + 1), so IO would still be
served even if a host goes down, but since we don't know anything about
your cluster yet, we can't really tell what's going on there.
Regards,
Eugen
Quoting Jeff Turmelle <jefft@xxxxxxxxxxxxxxxx>:
We are using NFS-Ganesha to serve data from our Nautilus cluster to
older clients. We recently had an OSD fail, and the NFS server will
not respond while we have degraded data redundancy. This also
happens on the rare occasion when we have some lost objects on a PG.
Is this a known issue and is there a workaround?
—
Jeff Turmelle, Lead Systems Analyst
International Research Institute for Climate and Society
<http://iri.columbia.edu/>
Columbia Climate School <https://climate.columbia.edu/>
cell: (845) 652-3461
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx