I have 3 systems w/ a cephfs mounted on them.
I am seeing material 'lag'. By 'lag' I mean it hangs for short periods (1s, sometimes 5s).
It is not reproducible on demand, though.
If I run
time find . -type f -print0 | xargs -0 stat > /dev/null
it might take ~130ms.
But it might take 10s. Once I've done it, it tends to stay at ~130ms, suggesting whatever data it needs is now in cache. In the cases where it hangs, if I remove the stat, it hangs in the find on a single file. It might hiccup 1 or 2 times in the find across 10k files.
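To pin down which path the hiccup lands on, the find/stat pipeline above can be approximated with a small script that times each directory listing and each lstat separately. This is only a rough sketch: the function name find_slow_ops and the 0.5s threshold are my own placeholders, not anything provided by ceph.

```python
import os
import stat
import time


def timed(fn, *args):
    """Call fn(*args) and return (elapsed_seconds, result)."""
    start = time.monotonic()
    result = fn(*args)
    return time.monotonic() - start, result


def find_slow_ops(root, threshold_s=0.5):
    """Walk root and return (seconds, operation, path) for every
    listdir or lstat call that took at least threshold_s seconds,
    slowest first. threshold_s is an arbitrary placeholder value."""
    slow = []
    pending = [root]
    while pending:
        directory = pending.pop()
        try:
            elapsed, names = timed(os.listdir, directory)
        except OSError:
            continue  # unreadable or vanished directory
        if elapsed >= threshold_s:
            slow.append((elapsed, "listdir", directory))
        for name in names:
            path = os.path.join(directory, name)
            try:
                elapsed, st = timed(os.lstat, path)
            except OSError:
                continue  # entry vanished between listdir and lstat
            if elapsed >= threshold_s:
                slow.append((elapsed, "lstat", path))
            if stat.S_ISDIR(st.st_mode):
                pending.append(path)
    return sorted(slow, reverse=True)
```

Running something like `for secs, op, path in find_slow_ops("/cephfs"): print(secs, op, path)` on a cold cache should show whether the stalls cluster on particular directories or files, or move around between runs.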
This lag might affect e.g. 'cwd', writing a file, basically all operations.
Does anyone have any suggestions? It's a very irritating problem. I do not see errors in dmesg.
The 3 systems w/ the filesystem mounted are running Ubuntu 15.10 w/ 4.3.0-040300-generic kernel. They are running cephfs from the kernel driver, mounted in /etc/fstab as:
10.100.10.60,10.100.10.61,10.100.10.62:/ /cephfs ceph _netdev,noauto,noatime,x-systemd.requires=network-online.target,x-systemd.automount,x-systemd.device-timeout=10,name=admin,secret=XXXX== 0 2
I have 3 mds: 1 active, 2 standby. The 3 machines that are the mons {nubo-1/-2/-3} are also the ones that have the cephfs mounted.
They have a 9K mtu between the systems, and I have checked with ping -s ### -M do <dest> that there are no blackholes in size... up to 8954 works, and 8955 gives 'would fragment'.
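As a sanity check on those ping numbers, the -s value is only the ICMP payload, so the on-wire IP packet is larger. A quick worked calculation, assuming plain IPv4 with no IP options (the constant and function names are just for illustration):

```python
ICMP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options


def ip_packet_size(ping_payload_bytes):
    """Size of the IP packet produced by ping -s <ping_payload_bytes>."""
    return ping_payload_bytes + ICMP_HEADER_BYTES + IPV4_HEADER_BYTES


largest_ok = 8954  # largest payload that passed with -M do
print(ip_packet_size(largest_ok))         # 8982
print(9000 - ip_packet_size(largest_ok))  # 18
```

So the largest unfragmented packet is 8982 bytes. If the interfaces are set to exactly 9000, that is 18 bytes short, which may or may not be related to the hangs.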
All the storage devices are 1TB Samsung SSD, and all are on sata. There is no material load on the system while this is occurring (a bit of background fs usage I guess, but it's otherwise idle, just me).
$ ceph status
    cluster b23abffc-71c4-4464-9449-3f2c9fbe1ded
     health HEALTH_OK
     monmap e1: 3 mons at {nubo-1=10.100.10.60:6789/0,nubo-2=10.100.10.61:6789/0,nubo-3=10.100.10.62:6789/0}
            election epoch 1070, quorum 0,1,2 nubo-1,nubo-2,nubo-3
     mdsmap e587: 1/1/1 up {0=nubo-2=up:active}, 2 up:standby
     osdmap e2346: 6 osds: 6 up, 6 in
      pgmap v113350: 840 pgs, 6 pools, 143 GB data, 104 kobjects
            288 GB used, 5334 GB / 5622 GB avail
                 840 active+clean
I've checked and the network between them is perfect: no loss, ~no latency (<< 1ms, they are adjacent on an L2 segment). The same holds for all the osd hosts [there are 6 osd].
$ ceph osd tree
ID WEIGHT  TYPE NAME         UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 5.48996 root default
-2 0.89999     host nubo-1
 0 0.89999         osd.0          up  1.00000          1.00000
-3 0.89999     host nubo-2
 1 0.89999         osd.1          up  1.00000          1.00000
-4 0.89999     host nubo-3
 2 0.89999         osd.2          up  1.00000          1.00000
-5 0.92999     host nubo-19
 3 0.92999         osd.3          up  1.00000          1.00000
-6 0.92999     host nubo-20
 4 0.92999         osd.4          up  1.00000          1.00000
-7 0.92999     host nubo-21
 5 0.92999         osd.5          up  1.00000          1.00000