Re: [nautilus][mds] MDS fall into ReadOnly mode

sathvik vutukuri <7vik.sathvik@xxxxxxxxx> · Thu, 30 Jul 2020 17:05:19 +0530

Have you tried restarting the MDS?
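
Before restarting, it may be worth pulling the key facts out of mds.log:
how many requests were slow, how long the oldest one was blocked, and
which write error forced the read-only switch. Below is a minimal Python
sketch, assuming each log entry sits on a single line as it does in the
actual log file. The names summarize, SLOW_RE and ERR_RE are my own
illustration and not part of Ceph, and the regular expressions are
modelled only on the messages quoted in this thread.

```python
import re
from datetime import timedelta

# Patterns modelled on the log messages quoted in this thread (an assumption:
# other Ceph releases may word these messages differently).
SLOW_RE = re.compile(
    r"(\d+) slow requests, \d+ included below; oldest blocked for > ([\d.]+) secs"
)
ERR_RE = re.compile(r"unhandled write error \((\d+)\) (.+?), force readonly")


def summarize(lines):
    """Collect the worst slow-request figures and any fatal write errors."""
    summary = {
        "max_slow": 0,                   # highest slow-request count seen
        "oldest_blocked": timedelta(0),  # longest reported block time
        "write_errors": [],              # (errno, message) pairs
    }
    for line in lines:
        m = SLOW_RE.search(line)
        if m:
            summary["max_slow"] = max(summary["max_slow"], int(m.group(1)))
            blocked = timedelta(seconds=float(m.group(2)))
            summary["oldest_blocked"] = max(summary["oldest_blocked"], blocked)
            continue
        m = ERR_RE.search(line)
        if m:
            summary["write_errors"].append((int(m.group(1)), m.group(2)))
    return summary


# Entries copied from the quoted mds.log excerpt, joined back onto one line each.
SAMPLE = [
    "2020-07-30 18:14:38.574 7f6473346700  0 log_channel(cluster) log [WRN] : "
    "94 slow requests, 0 included below; oldest blocked for > 619910.524984 secs",
    "2020-07-30 18:14:44.835 7f646f33e700 -1 mds.0.159432 unhandled write error "
    "(90) Message too long, force readonly...",
    "2020-07-30 18:15:18.000 7f6473346700  0 log_channel(cluster) log [WRN] : "
    "114 slow requests, 5 included below; oldest blocked for > 619949.950199 secs",
]

result = summarize(SAMPLE)
print(result["max_slow"], result["oldest_blocked"].days, result["write_errors"])
```

On the quoted excerpt this reports that the oldest request had been
blocked for a little over seven days, so the slow requests started long
before the MDS went read-only. Error 90 is EMSGSIZE on Linux.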

On Thu, 30 Jul 2020, 16:40 Frank Yu, <flyxiaoyu@xxxxxxxxx> wrote:

> I found the following errors in mds.log:
>
> 2020-07-30 18:14:38.574 7f6473346700  0 log_channel(cluster) log [WRN] : 94 slow requests, 0 included below; oldest blocked for > 619910.524984 secs
> 2020-07-30 18:14:43.574 7f6473346700  0 log_channel(cluster) log [WRN] : 94 slow requests, 0 included below; oldest blocked for > 619915.525079 secs
> 2020-07-30 18:14:44.835 7f646f33e700 -1 mds.0.159432 unhandled write error (90) Message too long, force readonly...
> 2020-07-30 18:14:44.835 7f646f33e700  1 mds.0.cache force file system read-only
> 2020-07-30 18:14:44.835 7f646f33e700  0 log_channel(cluster) log [WRN] : force file system read-only
> 2020-07-30 18:15:18.000 7f6473346700  0 log_channel(cluster) log [WRN] : 114 slow requests, 5 included below; oldest blocked for > 619949.950199 secs
>
>
> On Thu, Jul 30, 2020 at 6:55 PM Frank Yu <flyxiaoyu@xxxxxxxxx> wrote:
>
> > Hi guys,
> >
> > I have a Ceph cluster with three MDS servers: two are active and the
> > remaining one is in standby-replay mode. Today I noticed the message
> > '1 MDSs are read only' when checking the cluster status with
> > 'ceph -s'. Details below:
> >
> > # ceph -s
> >   cluster:
> >     id:     3d43e9a5-50dc-4f84-9493-656bf4f06f8c
> >     health: HEALTH_WARN
> >             5 clients failing to advance oldest client/flush tid
> >             1 MDSs are read only
> >             2 MDSs report slow requests
> >             2 MDSs behind on trimming
> >             BlueFS spillover detected on 33 OSD(s)
> >
> >   services:
> >     mon: 3 daemons, quorum bjcpu-001,bjcpu-002,bjcpu-003 (age 3M)
> >     mgr: bjcpu-001.xxxx.io(active, since 3M), standbys: bjcpu-003.xxxx.io, bjcpu-002.xxxx.io
> >     mds: cephfs:2 {0=bjcpu-003.xxxx.io=up:active,1=bjcpu-001.xxxx.io=up:active} 1 up:standby-replay
> >     osd: 48 osds: 48 up (since 7w), 48 in (since 7M)
> >
> >   data:
> >     pools:   3 pools, 2304 pgs
> >     objects: 301.35M objects, 70 TiB
> >     usage:   246 TiB used, 280 TiB / 527 TiB avail
> >     pgs:     2295 active+clean
> >              9    active+clean+scrubbing+deep
> >
> >   io:
> >     client:   254 B/s rd, 44 MiB/s wr, 0 op/s rd, 15 op/s wr
> >
> > What should I do to fix this? The cluster still seems to work fine
> > (clients can read and write).
> >
> > Many thanks
> >
> >
> > --
> > Regards
> > Frank Yu
> >
>
>
> --
> Regards
> Frank Yu
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>