On Mon, 2022-08-15 at 08:33 +0000, Eugen Block wrote:
> Hi,
>
> do you see high disk utilization on the OSD nodes? How is the load on
> the active MDS? How much RAM is configured for the MDS
> (mds_cache_memory_limit)?
> You can list all MDS sessions with 'ceph daemon mds.<MDS> session ls'
> to identify all your clients and 'ceph daemon mds.<MDS>
> dump_blocked_ops' to show blocked requests. But simply killing
> sessions isn't a solution, so first you need to find out where the
> bottleneck is. Do you see hung requests or something? Anything in
> 'dmesg' on the client side?

Looking at the MDS ops in flight, the majority are journal_and_reply:

$ sudo ceph daemon mds.$(hostname) dump_ops_in_flight | grep 'flag_point' | sort | uniq -c
     28         "flag_point": "failed to rdlock, waiting",
      2         "flag_point": "failed to wrlock, waiting",
     18         "flag_point": "failed to xlock, waiting",
    418         "flag_point": "submit entry: journal_and_reply",

Does anyone know where I can find more info on what journal_and_reply means? Is it solely about reading from and writing to the metadata pool, or is it waiting for the OSDs to perform some action first (like ensuring a file is gone, so that it can then write to the metadata pool, perhaps)?

If it is related to the OSDs in some way, then I can go and focus on improving them (not that I shouldn't be doing that anyway, but I'm trying to work out where to focus). For example, maybe setting osd_op_queue_cut_off to high [1] might help? (osd_op_queue is already set to wpq.)

I also notice that when performance tanks, the throughput on the metadata pool becomes very spiky (including dropping to zero). We're not talking huge numbers, though; the range is between 0 and 150 MB/sec, so almost nothing... which again makes me think it is related to the OSDs.

[1] https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/#confval-osd_op_queue_cut_off

Many thanks!
-c
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
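
P.S. In case it helps anyone following along, this is the rough (untested) sketch of what I plan to run next: first to see whether the MDS is actually sitting on outstanding RADOS ops to the OSDs, and then to try the cut-off change from [1]. I'm assuming here that our OSD options live in the central config db rather than ceph.conf, and my understanding is that osd_op_queue_cut_off only takes effect after an OSD restart, so treat the last command accordingly.

# Dump the RADOS operations the MDS currently has outstanding against the
# OSDs; if these linger, the MDS is waiting on the OSDs rather than on its
# own locks.
$ sudo ceph daemon mds.$(hostname) objecter_requests

# Check what the OSDs are configured with at the moment.
$ sudo ceph config get osd osd_op_queue
$ sudo ceph config get osd osd_op_queue_cut_off

# The change suggested in [1]; as far as I understand it, this only
# applies after the OSDs are restarted.
$ sudo ceph config set osd osd_op_queue_cut_off high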