Re: ceph failing to write data - MDSs read only

Hi Kotresh,

The issue is fixed for now. I followed the steps below.

I unmounted the kernel client and restarted the MDS service, which brought the
MDS back to normal. The "1 MDSs behind on trimming" warning did not clear right
away, but after waiting about 20 - 30 minutes it resolved on its own and ceph
status is healthy now.

I didn't modify any MDS cache related settings; they are all at their defaults.


On Mon, Jan 2, 2023 at 10:54 AM Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx> wrote:

> The MDS asks clients to release caps in order to trim its cache when it is
> under cache pressure, and it may also proactively ask a client to release caps
> in some cases. In your case, the client is not releasing the caps soon enough.
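>
> You can usually see which client session the MDS is complaining about from
> 'ceph health detail', and 'session ls' shows how many caps each session holds.
> For example (a quick sketch; <name> is the active MDS daemon and the JSON
> field names can vary slightly between releases):
>
>     ceph health detail
>     ceph tell mds.<name> session ls | grep -e '"id"' -e 'num_caps'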
>
> A few questions (some example commands for collecting this are sketched below):
>
> 1. Have you tuned the MDS cache configuration? If so, please share the settings.
> 2. Is this a kernel client or a fuse client?
> 3. Could you please share the 'session ls' output?
> 4. Please also share the MDS and client logs.
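>
> Something along these lines should be enough to collect that (a sketch only;
> <name> is the active MDS daemon, and log locations depend on how the cluster
> was deployed):
>
>     # current MDS cache / trimming related settings
>     ceph config get mds mds_cache_memory_limit
>     ceph config get mds mds_log_max_segments
>
>     # client sessions as seen by the MDS
>     ceph tell mds.<name> session ls > session_ls.json
>
>     # MDS log: /var/log/ceph/... on the MDS host, or journalctl for the MDS unit
>     # kernel client messages: dmesg on the client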
>
> Sometimes dropping the caches (echo 3 > /proc/sys/vm/drop_caches if it's a
> kclient) or unmounting and remounting the problematic client can fix the
> issue, if that is acceptable in your environment.
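>
> For example (illustrative only; the mount point and mount options are
> placeholders and will depend on your setup):
>
>     # on the problematic client (kclient), as root
>     sync
>     echo 3 > /proc/sys/vm/drop_caches
>
>     # or, more disruptive: unmount and remount
>     umount /mnt/cephfs
>     mount -t ceph <mon-host>:/ /mnt/cephfs -o name=<cephx-user>,secretfile=/etc/ceph/<user>.secret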
>
> Thanks and Regards,
> Kotresh H R
>
> On Thu, Dec 29, 2022 at 4:35 PM Amudhan P <amudhan83@xxxxxxxxx> wrote:
>
>> Hi,
>>
>> I am suddenly facing an issue with my Ceph cluster; I am using ceph version 16.2.6.
>> I couldn't find any solution for the issue below.
>> Any suggestions?
>>
>>
>>     health: HEALTH_WARN
>>             1 clients failing to respond to capability release
>>             1 clients failing to advance oldest client/flush tid
>>             1 MDSs are read only
>>             1 MDSs report slow requests
>>             1 MDSs behind on trimming
>>
>>   services:
>>     mon: 3 daemons, quorum strg-node1,strg-node2,strg-node3 (age 9w)
>>     mgr: strg-node1.ivkfid(active, since 9w), standbys: strg-node2.unyimy
>>     mds: 1/1 daemons up, 1 standby
>>     osd: 32 osds: 32 up (since 9w), 32 in (since 5M)
>>
>>   data:
>>     volumes: 1/1 healthy
>>     pools:   3 pools, 321 pgs
>>     objects: 13.19M objects, 45 TiB
>>     usage:   90 TiB used, 85 TiB / 175 TiB avail
>>     pgs:     319 active+clean
>>              2   active+clean+scrubbing+deep
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


