On 03/24/2014 08:02 AM, Yan, Zheng wrote:
> On Sun, Mar 23, 2014 at 8:39 PM, Sascha Frey <sf@xxxxxxxxxxx> wrote:
>> Hi list,
>>
>> I'm new to Ceph, so I installed a four-node Ceph cluster for testing
>> purposes.
>>
>> Each node has two 6-core Sandy Bridge Xeons, 64 GiB of RAM, six 15k rpm
>> SAS drives, one SSD drive for journals and 10G Ethernet.
>> We're using Debian GNU/Linux 7.4 (Wheezy) with kernel 3.13 from the
>> Debian backports repository and Ceph 0.72.2-1~bpo70+1.
>>
>> Every node runs six OSDs (one per SAS disk). The SSD is partitioned
>> into six parts for the journals.
>> Three of the same nodes act as monitors (no extra hardware for mons and
>> MDS during testing). At first I used node #4 as an MDS; later I installed
>> Ceph-MDS on all four nodes with set_max_mds=3.
>>
>> I increased pg_num and pgp_num to 1200 each for both the data and
>> metadata pools.
>>
>> I mounted CephFS on one node using the kernel client.
>> Writing to a single big file is fast:
>>
>> $ dd if=/dev/zero of=bigfile bs=1M count=1M
>> 1048576+0 records in
>> 1048576+0 records out
>> 1099511627776 bytes (1.1 TB) copied, 1240.52 s, 886 MB/s
>>
>> Reading is less fast:
>>
>> $ dd if=bigfile of=/dev/null bs=1M
>> 1048576+0 records in
>> 1048576+0 records out
>> 1099511627776 bytes (1.1 TB) copied, 3226.8 s, 341 MB/s
>>
>> (During reading, the nodes are mostly idle: >90% idle, 1-1.8% wa.)
>>
>> After this, I tried to copy the Linux kernel source tree (source and
>> destination directories both on CephFS, 600 MiB, 45k files):
>>
>> $ time cp -a linux-3.13.6 linux-3.13.6-copy
>>
>> real    35m34.184s
>> user    0m1.884s
>> sys     0m11.372s
>>
>> That's much too slow.
>> The same copy takes just a few seconds on a single desktop-class SATA
>> drive.
>>
>> I can't see any load or I/O wait on any of the four nodes. I tried
>> different mount options:
>>
>> mon1,mon2,mon3:/ on /export type ceph (rw,relatime,name=someuser,secret=<hidden>,nodcache,nofsc)
>> mon1,mon2,mon3:/ on /export type ceph (rw,relatime,name=someuser,secret=<hidden>,dcache,fsc,wsize=10485760,rsize=10485760)
>>
>> Output of 'ceph status':
>>
>> ceph status
>>     cluster 32ea6593-8cd6-40d6-ac3b-7450f1d92d16
>>      health HEALTH_OK
>>      monmap e1: 3 mons at {howard=xxx.yyy.zzz.199:6789/0,leonard=xxx.yyy.zzz.196:6789/0,penny=xxx.yyy.zzz.198:6789/0}, election epoch 32, quorum 0,1,2 howard,leonard,penny
>>      mdsmap e107: 1/1/1 up {0=penny=up:active}, 3 up:standby
>>      osdmap e276: 24 osds: 24 up, 24 in
>>       pgmap v8932: 2464 pgs, 3 pools, 1028 GB data, 514 kobjects
>>             2061 GB used, 11320 GB / 13382 GB avail
>>                 2464 active+clean
>>   client io 119 MB/s rd, 509 B/s wr, 43 op/s
>>
>> I'd appreciate it if someone could help me find the reason for this
>> odd behaviour.
>
> In your case, copying each file requires sending several requests to
> the MDS/OSD, and each request can take several to tens of milliseconds.
> That's why only about 20 files were copied per second. One option to
> improve the overall speed is to perform a parallel copy (you can find
> some scripts via Google).

I have observed the same behavior in our cluster. Using GNU parallel to
copy the source tree,

/mnt/ceph/linux-3.13.6# time parallel -j10 cp -r {} /mnt/ceph/copy/ ::: *

reduced the time to

real    14m22.721s
user    0m1.208s
sys     0m7.200s

Hope it helps.

- Gurvinder
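
For file-level parallelism, rather than parallelising over the top-level
directories as in the parallel invocation above, here is a minimal sketch
using find and GNU xargs. SRC, DST and the worker count -P16 are
placeholders to adapt to your own CephFS mount and cluster:

  # Hypothetical paths for illustration; point SRC/DST at your CephFS mount.
  SRC=/mnt/ceph/linux-3.13.6
  DST=/mnt/ceph/copy/linux-3.13.6

  # Recreate the directory tree first (serially, so the parallel workers
  # below never race to create the same directory).
  cd "$SRC" && find . -type d -print0 | xargs -0 -I{} mkdir -p "$DST/{}"

  # Copy regular files with up to 16 workers, so many MDS/OSD round trips
  # are in flight at once instead of one file at a time.
  cd "$SRC" && find . -type f -print0 | xargs -0 -P16 -I{} cp -p {} "$DST/{}"

Copying file by file keeps all workers busy even when most of the data sits
in one or two large top-level directories, at the cost of spawning one cp
per file; symlinks and hard links are not handled in this sketch.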

> Regards
> Yan, Zheng
>
>> Cheers,
>> Sascha

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com