Re: [nautilus][mds] MDS fall into ReadOnly mode

Frank Yu <flyxiaoyu@xxxxxxxxx> · Thu, 30 Jul 2020 21:01:34 +0800

I have fixed it by following
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-May/034946.html.
Please ignore this thread.
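
For anyone who finds this thread later: the line that matters in the MDS log quoted below is the "unhandled write error (90) Message too long, force readonly" entry. Errno 90 is EMSGSIZE on Linux. The following is only a minimal sketch for locating such entries in an mds.log; the function name and the regular expression are illustrative assumptions based on the log format shown in this thread, not part of any Ceph tool, and lines wrapped by a mail client must be re-joined before matching.

```python
import re

# Pattern derived from the log line quoted in this thread (an assumption,
# not an official Ceph log grammar):
#   <date> <time> <thread-id> <level> mds.<rank>.<id> unhandled write error (<errno>) <text>, force readonly...
READONLY_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+\S+\s+-?\d+\s+"
    r"(?P<daemon>mds\.\S+) unhandled write error "
    r"\((?P<errno>\d+)\) (?P<msg>.*?), force readonly"
)


def find_readonly_events(lines):
    """Return one dict per 'force readonly' trigger found in the given log lines."""
    events = []
    for line in lines:
        m = READONLY_RE.search(line)
        if m:
            events.append({
                "time": m.group("ts"),
                "daemon": m.group("daemon"),
                "errno": int(m.group("errno")),
                "message": m.group("msg"),
            })
    return events


# Two lines taken from the log excerpt in this thread, already re-joined.
sample = [
    "2020-07-30 18:14:43.574 7f6473346700  0 log_channel(cluster) log [WRN] : "
    "94 slow requests, 0 included below; oldest blocked for > 619915.525079 secs",
    "2020-07-30 18:14:44.835 7f646f33e700 -1 mds.0.159432 unhandled write error "
    "(90) Message too long, force readonly...",
]

for event in find_readonly_events(sample):
    print(event)
```

To scan a real log, the same function can be fed an open file object, since it only iterates over lines. The sketch only finds the trigger; it says nothing about the cause or the fix, for which see the linked post.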

On Thu, Jul 30, 2020 at 7:52 PM Frank Yu <flyxiaoyu@xxxxxxxxx> wrote:

> No, I think I should first find out why it fell into read-only mode and then
> find the correct way to fix it. Just restarting the MDS is dangerous, I think.
>
>
> On Thu, Jul 30, 2020 at 7:35 PM sathvik vutukuri <7vik.sathvik@xxxxxxxxx>
> wrote:
>
>> Have you tried restarting the MDS?
>>
>> On Thu, 30 Jul 2020, 16:40 Frank Yu, <flyxiaoyu@xxxxxxxxx> wrote:
>>
>>> I got some errors in mds.log, as shown below:
>>>
>>> 2020-07-30 18:14:38.574 7f6473346700  0 log_channel(cluster) log [WRN] : 94 slow requests, 0 included below; oldest blocked for > 619910.524984 secs
>>> 2020-07-30 18:14:43.574 7f6473346700  0 log_channel(cluster) log [WRN] : 94 slow requests, 0 included below; oldest blocked for > 619915.525079 secs
>>> 2020-07-30 18:14:44.835 7f646f33e700 -1 mds.0.159432 unhandled write error (90) Message too long, force readonly...
>>> 2020-07-30 18:14:44.835 7f646f33e700  1 mds.0.cache force file system read-only
>>> 2020-07-30 18:14:44.835 7f646f33e700  0 log_channel(cluster) log [WRN] : force file system read-only
>>> 2020-07-30 18:15:18.000 7f6473346700  0 log_channel(cluster) log [WRN] : 114 slow requests, 5 included below; oldest blocked for > 619949.950199 secs
>>>
>>>
>>> On Thu, Jul 30, 2020 at 6:55 PM Frank Yu <flyxiaoyu@xxxxxxxxx> wrote:
>>>
>>> > Hi guys,
>>> >
>>> > I have a Ceph cluster with three MDS servers: two of them are active,
>>> > while the remaining one is in standby-replay mode. Today the message
>>> > '1 MDSs are read only' showed up when I checked the cluster status with
>>> > 'ceph -s'. Details are below:
>>> >
>>> > # ceph -s
>>> >   cluster:
>>> >     id:     3d43e9a5-50dc-4f84-9493-656bf4f06f8c
>>> >     health: HEALTH_WARN
>>> >             5 clients failing to advance oldest client/flush tid
>>> >             1 MDSs are read only
>>> >             2 MDSs report slow requests
>>> >             2 MDSs behind on trimming
>>> >             BlueFS spillover detected on 33 OSD(s)
>>> >
>>> >   services:
>>> >     mon: 3 daemons, quorum bjcpu-001,bjcpu-002,bjcpu-003 (age 3M)
>>> >     mgr: bjcpu-001.xxxx.io(active, since 3M), standbys: bjcpu-003.xxxx.io, bjcpu-002.xxxx.io
>>> >     mds: cephfs:2 {0=bjcpu-003.xxxx.io=up:active,1=bjcpu-001.xxxx.io=up:active} 1 up:standby-replay
>>> >     osd: 48 osds: 48 up (since 7w), 48 in (since 7M)
>>> >
>>> >   data:
>>> >     pools:   3 pools, 2304 pgs
>>> >     objects: 301.35M objects, 70 TiB
>>> >     usage:   246 TiB used, 280 TiB / 527 TiB avail
>>> >     pgs:     2295 active+clean
>>> >              9    active+clean+scrubbing+deep
>>> >
>>> >   io:
>>> >     client:   254 B/s rd, 44 MiB/s wr, 0 op/s rd, 15 op/s wr
>>> >
>>> > What should I do to fix this error? The cluster still seems to
>>> > work fine (clients can read and write).
>>> >
>>> > Many thanks
>>> >
>>> >
>>> > --
>>> > Regards
>>> > Frank Yu
>>> >
>>>
>>>
>>> --
>>> Regards
>>> Frank Yu
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>>
>
> --
> Regards
> Frank Yu
>

-- 
Regards
Frank Yu