>>How can you see that the cache is filling up and you need to execute
>>"echo 2 > /proc/sys/vm/drop_caches"?

You can monitor the number of ceph dentry objects in slabinfo.

Here is a small script I'm running in cron:

#!/bin/bash

# Exit if another instance of this script is already running.
if pidof -o %PPID -x "dropcephinodecache.sh" >/dev/null; then
    echo "Process already running"
    exit 1
fi

# Field 2 of /proc/slabinfo is the number of active objects in the slab.
# Take the first matching slab (kernel client or ceph-fuse client).
value=$(awk '$1 == "ceph_dentry_info" || $1 == "fuse_inode" {print $2; exit}' /proc/slabinfo)

# Default to 0 so the test does not fail when neither slab is present.
if [ "${value:-0}" -gt 500000 ]; then
    echo "Flush inode cache"
    echo 2 > /proc/sys/vm/drop_caches
fi

----- Original Message -----
From: "Marc Roos" <M.Roos@xxxxxxxxxxxxxxxxx>
To: "transuranium.yue" <transuranium.yue@xxxxxxxxx>, "Zheng Yan" <ukernel@xxxxxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Monday, 21 January 2019 15:53:17
Subject: Re: MDS performance issue

How can you see that the cache is filling up and you need to execute
"echo 2 > /proc/sys/vm/drop_caches"?

-----Original Message-----
From: Yan, Zheng [mailto:ukernel@xxxxxxxxx]
Sent: 21 January 2019 15:50
To: Albert Yue
Cc: ceph-users
Subject: Re: MDS performance issue

On Mon, Jan 21, 2019 at 11:16 AM Albert Yue <transuranium.yue@xxxxxxxxx> wrote:
>
> Dear Ceph Users,
>
> We have set up a cephFS cluster with 6 osd machines, each with 16 8TB hard disks. Ceph version is luminous 12.2.5. We created one data pool with these hard disks and created another metadata pool with 3 ssds. We created an MDS with 65GB cache size.
>
> But our users keep complaining that cephFS is too slow. What we observed is that cephFS is fast when we switch to a new MDS instance; once the cache fills up (which happens very fast), clients become very slow when performing basic filesystem operations such as `ls`.
>

It seems that clients hold lots of unused inodes in their icache, which prevents the MDS from trimming the corresponding objects from its cache. mimic has the command "ceph daemon mds.x cache drop" to ask clients to drop their cache. I'm also working on a patch that makes the kernel client release unused inodes.
For luminous, there is not much we can do, except periodically run "echo 2 > /proc/sys/vm/drop_caches" on each client.

> What we know is that our users are putting lots of small files into the cephFS; there are now around 560 million files. We didn't see high CPU wait on the MDS instance, and the metadata pool uses only around 200MB of space.
>
> My question is, what is the relationship between the metadata pool and the MDS? Is this performance issue caused by the hardware behind the metadata pool? Why does the metadata pool use only 200MB of space while we see 3k iops on each of these three ssds, and why can't the MDS cache all of these 200MB in memory?
>
> Thanks very much!
>
>
> Best Regards,
>
> Albert
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
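
The slabinfo check from the cron script at the top of this thread can also be split into two small functions, which makes it easier to try the parsing against a saved copy of /proc/slabinfo before letting it touch drop_caches. This is only a sketch: the function names slab_active_objs and cache_over_threshold are made up for illustration and do not come from the thread, and it assumes the usual slabinfo layout in which the second column is the active object count.

```bash
#!/bin/bash
# Sketch only: slab_active_objs and cache_over_threshold are hypothetical
# helper names, not part of Ceph or of the script posted in this thread.

# Print the active object count (field 2) of the first slab in FILE whose
# name matches one of the given names; print 0 if none of them is present.
# Usage: slab_active_objs FILE NAME [NAME...]
slab_active_objs() {
    local file=$1
    shift
    awk -v names="$*" '
        BEGIN {
            n = split(names, want, " ")
            for (i = 1; i <= n; i++) ok[want[i]] = 1
        }
        ($1 in ok) { print $2; found = 1; exit }
        END { if (!found) print 0 }
    ' "$file"
}

# Succeed (exit status 0) when COUNT is strictly greater than THRESHOLD.
# Usage: cache_over_threshold COUNT THRESHOLD
cache_over_threshold() {
    [ "$1" -gt "$2" ]
}
```

From a root cron job this might be used as count=$(slab_active_objs /proc/slabinfo ceph_dentry_info fuse_inode), followed by the same "echo 2 > /proc/sys/vm/drop_caches" as in the original script when cache_over_threshold "$count" 500000 succeeds. The 500000 threshold is the value from the script above, not a tuned recommendation.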