Re: 1 MDSs report slow metadata IOs

What is causing the slow MDS metadata IOs?

Your flapping OSDs.
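
You can see it in your own status output (2 osds down, osd.0 has slow ops).
To confirm which OSDs are affected, for example:

  ceph health detail
  ceph osd tree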

Currently, there are 2 MDS and 3 monitors deployed.
Would it help to run just one MDS and one monitor?

No, you need to figure out why your OSDs crash. More details about your setup (ceph version, deployment method, hardware resources) and the logs from a crashing OSD could help identify the issue.
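
As a starting point, something like this would collect that information
(assuming a systemd-based package deployment; osd.0 is only an example ID,
use one of the crashing OSDs):

  ceph versions
  ceph crash ls
  journalctl -u ceph-osd@0 --no-pager --since "1 hour ago"

If the OSDs run in containers (cephadm or Rook), pull the logs from the
container/pod instead.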


Quoting Abdelillah Asraoui <aasraoui@xxxxxxxxx>:

The OSDs are continuously flapping up/down due to the slow MDS metadata IOs.
What is causing the slow MDS metadata IOs?
Currently, there are 2 MDS and 3 monitors deployed.
Would it help to run just one MDS and one monitor?

thanks!

On Tue, Oct 5, 2021 at 1:42 PM Eugen Block <eblock@xxxxxx> wrote:

All your PGs are inactive: if two of four OSDs are down and you
probably have a pool size of 3, then no IO can be served. You'd need at
least three up OSDs to resolve that.
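
To check the pool size, for example ("<pool-name>" is only a placeholder):

  ceph osd pool ls detail
  ceph osd pool get <pool-name> size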


Quoting Abdelillah Asraoui <aasraoui@xxxxxxxxx>:

> Ceph is reporting a warning about slow metadata IOs on one of the MDS
> servers; this is a new cluster with no upgrades.
>
> Has anyone encountered this, and is there a workaround?
>
> ceph -s
>
>   cluster:
>     id:     801691e6xx-x-xx-xx-xx
>     health: HEALTH_WARN
>             1 MDSs report slow metadata IOs
>             noscrub,nodeep-scrub flag(s) set
>             2 osds down
>             2 hosts (2 osds) down
>             Reduced data availability: 97 pgs inactive, 66 pgs peering, 53 pgs stale
>             Degraded data redundancy: 31 pgs undersized
>             2 slow ops, oldest one blocked for 30 sec, osd.0 has slow ops
>
>   services:
>     mon: 3 daemons, quorum a,c,f (age 15h)
>     mgr: a(active, since 17h)
>     mds: myfs:1 {0=myfs-a=up:creating} 1 up:standby
>     osd: 4 osds: 2 up (since 36s), 4 in (since 10h)
>          flags noscrub,nodeep-scrub
>
>   data:
>     pools:   4 pools, 97 pgs
>     objects: 0 objects, 0 B
>     usage:   1.0 GiB used, 1.8 TiB / 1.8 TiB avail
>     pgs:     100.000% pgs not active
>              44 creating+peering
>              31 stale+undersized+peered
>              22 stale+creating+peering
>
>   progress:
>     Rebalancing after osd.2 marked in (10h)
>       [............................]
>     Rebalancing after osd.3 marked in (10h)
>       [............................]
>
> Thanks!







_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



