Hi Frank,

This usually happens when the client is buggy or unresponsive. The warning is
triggered when a client fails to respond in time to the MDS's request to
release caps; "in time" is determined by session_timeout (defaults to 60
seconds). Did you make any config changes?

*Dhairya Parmar*
Associate Software Engineer, CephFS
Red Hat Inc. <https://www.redhat.com/>
dparmar@xxxxxxxxxx <https://www.redhat.com/>

On Tue, Aug 22, 2023 at 9:12 PM Frank Schilder <frans@xxxxxx> wrote:

> Hi all,
>
> I have had this warning the whole day already (latest Octopus cluster):
>
> HEALTH_WARN 4 clients failing to respond to capability release; 1 pgs not
> deep-scrubbed in time
> [WRN] MDS_CLIENT_LATE_RELEASE: 4 clients failing to respond to capability
> release
>     mds.ceph-24(mds.1): Client sn352.hpc.ait.dtu.dk:con-fs2-hpc failing
> to respond to capability release client_id: 145698301
>     mds.ceph-24(mds.1): Client sn463.hpc.ait.dtu.dk:con-fs2-hpc failing
> to respond to capability release client_id: 189511877
>     mds.ceph-24(mds.1): Client sn350.hpc.ait.dtu.dk:con-fs2-hpc failing
> to respond to capability release client_id: 189511887
>     mds.ceph-24(mds.1): Client sn403.hpc.ait.dtu.dk:con-fs2-hpc failing
> to respond to capability release client_id: 231250695
>
> If I look at the session info from mds.1 for these clients, I see this:
>
> # ceph tell mds.1 session ls | jq -c '[.[] | {id: .id, h:
> .client_metadata.hostname, addr: .inst, fs: .client_metadata.root, caps:
> .num_caps, req: .request_load_avg}] | sort_by(.caps) | .[]' | grep
> -e 145698301 -e 189511877 -e 189511887 -e 231250695
> {"id":189511887,"h":"sn350.hpc.ait.dtu.dk","addr":"client.189511887 v1:192.168.57.221:0/4262844211","fs":"/hpc/groups","caps":2,"req":0}
> {"id":231250695,"h":"sn403.hpc.ait.dtu.dk","addr":"client.231250695 v1:192.168.58.18:0/1334540218","fs":"/hpc/groups","caps":3,"req":0}
> {"id":189511877,"h":"sn463.hpc.ait.dtu.dk","addr":"client.189511877 v1:192.168.58.78:0/3535879569","fs":"/hpc/groups","caps":4,"req":0}
> {"id":145698301,"h":"sn352.hpc.ait.dtu.dk","addr":"client.145698301 v1:192.168.57.223:0/2146607320","fs":"/hpc/groups","caps":7,"req":0}
>
> We have mds_min_caps_per_client=4096, so it looks like the limit is well
> satisfied. Also, the file system is pretty idle at the moment.
>
> Why, and what exactly, is the MDS complaining about here?
>
> Thanks and best regards.
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
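As a side note, for anyone who wants to script this kind of check, the jq pipeline in Frank's mail can also be reproduced in Python against the JSON emitted by `ceph tell mds.N session ls`. This is only a sketch: the sample data below is made up to mirror the fields Frank posted, not taken from a real cluster, and the flagged IDs are simply the ones from the health warning.

```python
import json

# Sample output in the shape returned by `ceph tell mds.1 session ls`
# (values here are illustrative, not from a real cluster).
sessions_json = """
[
  {"id": 189511887,
   "client_metadata": {"hostname": "sn350.hpc.ait.dtu.dk", "root": "/hpc/groups"},
   "num_caps": 2, "request_load_avg": 0},
  {"id": 145698301,
   "client_metadata": {"hostname": "sn352.hpc.ait.dtu.dk", "root": "/hpc/groups"},
   "num_caps": 7, "request_load_avg": 0},
  {"id": 555000111,
   "client_metadata": {"hostname": "other.host", "root": "/hpc/home"},
   "num_caps": 4096, "request_load_avg": 3}
]
"""

# Client IDs taken from the MDS_CLIENT_LATE_RELEASE health warning.
flagged = {145698301, 189511877, 189511887, 231250695}

# Sort by cap count (like jq's sort_by(.caps)) and print only flagged sessions.
for s in sorted(json.loads(sessions_json), key=lambda s: s["num_caps"]):
    if s["id"] in flagged:
        print(f'{s["id"]} {s["client_metadata"]["hostname"]} caps={s["num_caps"]}')
```

Same idea as the jq + grep pipeline, just easier to extend, e.g. to alert when a flagged session's num_caps sits far below mds_min_caps_per_client.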