On Sat, Dec 19, 2015 at 4:34 AM, Don Waterloo <don.waterloo@xxxxxxxxx> wrote:
> I have 3 systems w/ a cephfs mounted on them.
> And i am seeing material 'lag'. By 'lag' i mean it hangs for little bits of
> time (1s, sometimes 5s).
> But very non repeatable.
>
> If i run
> time find . -type f -print0 | xargs -0 stat > /dev/null
> it might take ~130ms.
> But, it might take 10s. Once i've done it, it tends to stay @ the ~130ms,
> suggesting whatever data is now in cache. On the cases it hangs, if i remove
> the stat, it's hanging on the find of one file. It might hiccup 1 or 2 times
> in the find across 10k files.
>

When the operation hangs, do you see any 'slow request ...' log messages in
the cluster log? Also, do you have multiple clients accessing the
filesystem? Which version of ceph do you use?

Regards
Yan, Zheng

> This lag might affect e.g. 'cwd', writing a file, basically all operations.
>
> Does anyone have any suggestions? It's a very irritating problem. I do not see
> errors in dmesg.
>
> The 3 systems w/ the filesystem mounted are running Ubuntu 15.10 w/
> 4.3.0-040300-generic kernel. They are running cephfs from the kernel driver,
> mounted in /etc/fstab as:
>
> 10.100.10.60,10.100.10.61,10.100.10.62:/ /cephfs ceph
> _netdev,noauto,noatime,x-systemd.requires=network-online.target,x-systemd.automount,x-systemd.device-timeout=10,name=admin,secret=XXXX==
> 0 2
>
> I have 3 mds, 1 active, 2 standby. The 3 machines that are also the mons
> {nubo-1/-2/-3} are the ones that have the cephfs mounted.
>
> They have a 9K mtu between the systems, and i have checked with ping -s ###
> -M do <dest> that there are no blackholes in size... up to 8954 works, and
> 8955 gives 'would fragment'.
>
> All the storage devices are 1TB Samsung SSD, and all are on sata. There is
> no material load on the system while this is occurring (a bit of background
> fs usage i guess, but it's otherwise idle, just me).
>
> $ ceph status
>     cluster b23abffc-71c4-4464-9449-3f2c9fbe1ded
>      health HEALTH_OK
>      monmap e1: 3 mons at
> {nubo-1=10.100.10.60:6789/0,nubo-2=10.100.10.61:6789/0,nubo-3=10.100.10.62:6789/0}
>             election epoch 1070, quorum 0,1,2 nubo-1,nubo-2,nubo-3
>      mdsmap e587: 1/1/1 up {0=nubo-2=up:active}, 2 up:standby
>      osdmap e2346: 6 osds: 6 up, 6 in
>       pgmap v113350: 840 pgs, 6 pools, 143 GB data, 104 kobjects
>             288 GB used, 5334 GB / 5622 GB avail
>                  840 active+clean
>
> I've checked and the network between them is perfect: no loss, ~no latency (
> << 1ms, they are adjacent on an L2 segment), as are all the osd [there are 6
> osd].
>
> ceph osd tree
> ID WEIGHT  TYPE NAME         UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 5.48996 root default
> -2 0.89999     host nubo-1
>  0 0.89999         osd.0          up  1.00000          1.00000
> -3 0.89999     host nubo-2
>  1 0.89999         osd.1          up  1.00000          1.00000
> -4 0.89999     host nubo-3
>  2 0.89999         osd.2          up  1.00000          1.00000
> -5 0.92999     host nubo-19
>  3 0.92999         osd.3          up  1.00000          1.00000
> -6 0.92999     host nubo-20
>  4 0.92999         osd.4          up  1.00000          1.00000
> -7 0.92999     host nubo-21
>  5 0.92999         osd.5          up  1.00000          1.00000
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
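
Since the stall reportedly lands on a single file out of roughly 10k and does not repeat once the data is cached, it may help to time each metadata call separately rather than the whole find/stat pipeline. The following is only a minimal sketch in Python, not something taken from the thread: the function name probe_tree and the 0.5 s threshold are hypothetical choices, and it uses nothing beyond the standard os and time modules. It times every directory listing and every lstat call and reports the ones that exceed the threshold.

```python
import os
import time


def probe_tree(root, threshold_s=0.5):
    """Walk `root`, timing each directory listing and each lstat call.

    Returns (entries_statted, slow), where `slow` is a list of
    (operation, path, seconds) tuples for calls that took at least
    `threshold_s` seconds. Name and threshold are illustrative only.
    """
    slow = []
    statted = 0
    pending = [root]
    while pending:
        directory = pending.pop()
        # Time the directory listing itself (the part `find` depends on).
        start = time.monotonic()
        try:
            with os.scandir(directory) as it:
                entries = list(it)
        except OSError:
            continue  # directory vanished or is unreadable
        elapsed = time.monotonic() - start
        if elapsed >= threshold_s:
            slow.append(("readdir", directory, elapsed))

        for entry in entries:
            try:
                is_dir = entry.is_dir(follow_symlinks=False)
            except OSError:
                continue
            if is_dir:
                pending.append(entry.path)
                continue
            # Time the per-entry metadata lookup (the part `stat` depends on).
            start = time.monotonic()
            try:
                os.lstat(entry.path)
            except OSError:
                continue  # entry vanished between listing and lstat
            elapsed = time.monotonic() - start
            statted += 1
            if elapsed >= threshold_s:
                slow.append(("lstat", entry.path, elapsed))
    return statted, slow
```

Calling probe_tree("/cephfs") on a cold cache and printing the returned list would show whether the delay sits in a directory listing or in a per-file lookup, and on which path. The timestamps of any slow entries could then be compared against 'slow request' messages in the cluster log.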