This output seems typical for both active MDS servers:

---------------mds---------------- --mds_cache--- ------mds_log------ -mds_mem- -------mds_server------- mds_ -----objecter------ purg
req  rlat fwd  inos caps exi  imi |stry recy recd|subm evts segs repl|ino  dn  |hcr  hcs  hsr  cre  cat |sess|actv rd   wr   rdwr|purg|
   0    0    0 6.0M 887k 1.0k    0|  56    0    0|   7 3.0k  139    0|6.0M 6.0M|   0  154    0    0    0|  48|   0    0   14    0|   0
   0  11k    0 6.0M 887k  236    0|  56    0    0|   7 3.0k  142    0|6.0M 6.0M|   0   99    0    0    0|  48|   1    0   31    0|   0
   0    0    0 6.0M 887k  718    0|  56    0    0|   5 3.0k  143    0|6.0M 6.0M|   0  318    0    0    0|  48|   1    0   12    0|   0
   0  13k    0 6.0M 887k 3.4k    0|  56    0    0| 197 3.2k  145    0|6.0M 6.0M|   0   43    1    0    0|  48|   8    0  207    0|   0
   0    0    0 6.0M 884k 4.9k    0|  56    0    0|   0 3.2k  145    0|6.0M 6.0M|   0    2    0    0    0|  48|   0    0   10    0|   0
   0    0    0 6.0M 884k 2.1k    0|  56    0    0|   6 3.2k  147    0|6.0M 6.0M|   0    0    1    0    0|  48|   0    0   12    0|   0
   2    0    0 6.0M 882k 1.1k    0|  56    0    0|  75 3.3k  150    0|6.0M 6.0M|   2   23    0    0    0|  48|   0    0   42    0|   0
   0    0    0 6.0M 880k   16    0|  56    0    0|  88 3.4k  152    0|6.0M 6.0M|   0   48    0    0    0|  48|   3    0  115    0|   0
   1 2.4k    0 6.0M 878k  126    0|  56    0    0| 551 2.8k  130    0|6.0M 6.0M|   1   26    2    0    0|  48|   0    0  209    0|   0
   4  210    0 6.0M 874k    0    0|  56    0    0|   5 2.8k  131    0|6.0M 6.0M|   4   14    0    0    0|  48|   0    0  488    0|   0
   1  891    0 6.0M 870k  12k    0|  56    0    0|   0 2.8k  131    0|6.0M 6.0M|   1   33    0    0    0|  48|   0    0    0    0|   0
   5   15    2 6.0M 870k 8.2k    0|  56    0    0|  79 2.9k  134    0|6.0M 6.0M|   5   27    1    0    0|  48|   0    0   22    0|   0
   1   68    0 6.0M 858k    0    0|  56    0    0|  49 2.9k  136    0|6.0M 6.0M|   1    0    1    0    0|  48|   0    0   91    0|   0

The metadata pool is still taking 64 MB/s of writes. We have two active MDS
servers, without pinning. mds_cache_memory_limit is set to 20 GB, which ought
to be enough for anyone(tm), since only 24 GB of data is used in the metadata
pool. Does that offer any kind of clue?

On Thu, 8 Jul 2021 at 10:16, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> Hi,
>
> That's interesting -- yes, on a lightly loaded cluster the metadata IO
> should be almost nil.
> You can debug what is happening using ceph daemonperf on the active
> MDS, e.g. https://pastebin.com/raw/n0iD8zXY
>
> (Use a wide terminal to show all the columns.)
>
> Normally, lots of md io would indicate that the cache size is too
> small for the workload; but since you said the clients are pretty
> idle, this might not be the case for you.
>
> Cheers, Dan
>
> On Thu, Jul 8, 2021 at 9:36 AM Flemming Frandsen <dren.dk@xxxxxxxxx> wrote:
> >
> > We have a Nautilus cluster where any metadata write operation is very slow.
> >
> > We're seeing very light load from clients, as reported by dumping ops in
> > flight; often it's zero.
> >
> > We're also seeing about 100 MB/s of writes to the metadata pool, constantly,
> > for weeks on end, which seems excessive, as only 22 GB is utilized.
> >
> > Should the writes to the metadata pool not quiet down when there's nothing
> > going on?
> >
> > Is there any way I can get information about why the MDSes are thrashing so
> > badly?
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Flemming Frandsen - YAPH - http://osaa.dk - http://dren.dk/
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
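
For anyone hitting the same symptom, here is a minimal sketch of the inspection
commands discussed in this thread. The daemon name mds.mds1 and the pool name
cephfs_metadata are placeholders; substitute the names from your own cluster,
and run the ceph daemon / ceph daemonperf commands on the host where that MDS
is running.

    # Watch per-second MDS perf counters (wide terminal recommended)
    ceph daemonperf mds.mds1

    # Dump the raw perf counters, including mds_log and objecter statistics
    ceph daemon mds.mds1 perf dump

    # Confirm how many client requests are actually in flight
    ceph daemon mds.mds1 dump_ops_in_flight

    # Check the cache limit the running daemon is actually using
    ceph daemon mds.mds1 config get mds_cache_memory_limit

    # Watch client I/O rates per pool, including the metadata pool
    ceph osd pool stats cephfs_metadata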