On 14-09-2023 10:03, Xiubo Li wrote:
<----- snip ----->
Okay, there were stuck caps revocations, which could cause the
create/lookup/getattr requests to get stuck and then be reported as slow
requests. This should be a known issue, as I am now working on
https://tracker.ceph.com/issues/50223. So far I have found one case,
unlinking, that could cause it, and these warnings should eventually
disappear once the kclient releases the caps. For more detail, please
see my comments on this tracker.
This tracker is not finished yet and there could be other cases that
cause it. I will continue working on it this week and next.
Ah, well, in that case, you're in for a treat ;-). We have a mail
cluster (dovecot on CephFS) that is spamming us continuously (during
work hours) with these slow request warnings. We could send you (a lot
of) logs if you want more examples from production setups. This
behavior has not changed across any of the kernels we have tried
(currently 6.5). Shall I upload them to tracker 50223?
It would be great to get this fixed. Let us know if you need us to test
anything.
On a side note: separate metrics for the kclient for "# acquired caps"
and "# released caps" might be useful as well, since a kclient can be
very busy releasing caps while also acquiring new ones. Just a total #
of caps does not give this insight.
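
To illustrate the point, here is a minimal Python sketch. It assumes
two hypothetical cumulative counters per client, "acquired" and
"released"; these names are made up for the example and are not
existing kclient or MDS metrics.

```python
from dataclasses import dataclass


@dataclass
class CapsSample:
    """One sample of hypothetical per-client counters (not real kclient metrics)."""
    acquired: int  # cumulative number of caps acquired so far
    released: int  # cumulative number of caps released so far

    @property
    def held(self) -> int:
        # The single "total # caps" figure that is available today.
        return self.acquired - self.released


def churn(prev: CapsSample, cur: CapsSample) -> tuple[int, int, int]:
    """Return (acquired, released, net change) between two samples."""
    acq = cur.acquired - prev.acquired
    rel = cur.released - prev.released
    return acq, rel, acq - rel


# Two clients that look identical when only the total is reported.
idle_prev, idle_cur = CapsSample(1000, 0), CapsSample(1000, 0)
busy_prev, busy_cur = CapsSample(1000, 0), CapsSample(51000, 50000)

idle_churn = churn(idle_prev, idle_cur)
busy_churn = churn(busy_prev, busy_cur)
```

Both clients hold the same number of caps at the second sample, but
the churn tuples differ: the idle client did nothing in the interval,
while the busy one acquired and released caps at a high rate.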
Thanks,
Gr. Stefan