Stefan,

Since the fio benchmark uses direct I/O (--direct=1), maybe the writeback cache is not being exercised? The perf counters should give us the answer.

----- Original Message -----
From: "Josh Durgin" <josh.durgin@xxxxxxxxxxx>
To: "Stefan Priebe" <s.priebe@xxxxxxxxxxxx>
Cc: "Gregory Farnum" <greg@xxxxxxxxxxx>, "Alexandre DERUMIER" <aderumier@xxxxxxxxx>, "Sage Weil" <sage@xxxxxxxxxxx>, ceph-devel@xxxxxxxxxxxxxxx, "Mark Nelson" <mark.nelson@xxxxxxxxxxx>
Sent: Monday, July 2, 2012 22:30:19
Subject: Re: speedup ceph / scaling / find the bottleneck

On 07/02/2012 12:22 PM, Stefan Priebe wrote:
> On 02.07.2012 18:51, Gregory Farnum wrote:
>> On Sun, Jul 1, 2012 at 11:12 PM, Stefan Priebe - Profihost AG
>> <s.priebe@xxxxxxxxxxxx> wrote:
>>> @sage / mark
>>> How does the aggregation work? Does it work 4MB blockwise or per target node?
>> Aggregation is based on the 4MB blocks, and if you've got caching
>> enabled then it's also not going to flush them out to disk very often
>> if you're continuously updating the block. I don't remember all the
>> conditions, but essentially you'll run into dirty limits and it will
>> asynchronously flush out the data based on a combination of how old it
>> is and how long it's been since some version of it was stable on disk.
>
> Is there any way to check whether rbd caching works correctly? For me the
> I/O values do not change whether I switch writeback on or off, and it also
> doesn't matter how large I set the cache size.
>
> ...

If you add admin_socket=/path/to/admin_socket for your client running qemu
(in that client's ceph.conf section or manually on the qemu command line),
you can check that caching is enabled:

ceph --admin-daemon /path/to/admin_socket show config | grep rbd_cache

And you can see the statistics it generates (look for "cache") with:

ceph --admin-daemon /path/to/admin_socket perfcounters_dump

Josh

>>> Ceph:
>>> 2 VMs:
>>> write: io=2234MB,  bw=25405KB/s,  iops=6351,  runt= 90041msec
>>> read : io=4760MB,  bw=54156KB/s,  iops=13538, runt= 90007msec
>>> write: io=56372MB, bw=638402KB/s, iops=155,   runt= 90421msec
>>> read : io=86572MB, bw=981225KB/s, iops=239,   runt= 90346msec
>>>
>>> write: io=2222MB,  bw=25275KB/s,  iops=6318,  runt= 90011msec
>>> read : io=4747MB,  bw=54000KB/s,  iops=13500, runt= 90008msec
>>> write: io=55300MB, bw=626733KB/s, iops=153,   runt= 90353msec
>>> read : io=84992MB, bw=965283KB/s, iops=235,   runt= 90162msec
>>
>> I can't quite tell what's going on here; can you describe the test in
>> more detail?
>
> I network-booted my VM and then ran the following command:
>
> export DISK=/dev/vda
> (fio --filename=$DISK --direct=1 --rw=randwrite --bs=4k --size=200G \
>      --numjobs=50 --runtime=90 --group_reporting --name=file1; \
>  fio --filename=$DISK --direct=1 --rw=randread  --bs=4k --size=200G \
>      --numjobs=50 --runtime=90 --group_reporting --name=file1; \
>  fio --filename=$DISK --direct=1 --rw=write     --bs=4M --size=200G \
>      --numjobs=50 --runtime=90 --group_reporting --name=file1; \
>  fio --filename=$DISK --direct=1 --rw=read      --bs=4M --size=200G \
>      --numjobs=50 --runtime=90 --group_reporting --name=file1) \
>  | egrep " read| write"
>
> - random 4k writes
> - random 4k reads
> - sequential 4M writes
> - sequential 4M reads
>
> Stefan

--

Alexandre DERUMIER
Systems and Network Engineer

Phone: 03 20 68 88 85
Fax:   03 20 68 90 88

45 Bvd du Général Leclerc 59100 Roubaix
12 rue Marivaux 75002 Paris

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
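
For reference, a minimal client-side ceph.conf sketch that enables the RBD
writeback cache and the admin socket Josh mentions might look roughly like
this; the section name, socket path, and cache size below are illustrative
assumptions, not values taken from Stefan's setup:

[client]
        # illustrative values; adjust the path and size to your environment
        rbd cache = true
        rbd cache size = 33554432
        admin socket = /var/run/ceph/rbd-client.asok

With qemu restarted against that config, the two admin-socket commands from
the thread can be used together to confirm that the cache options took effect
and that the cache counters actually move while fio is running, e.g.:

ceph --admin-daemon /var/run/ceph/rbd-client.asok show config | grep rbd_cache
ceph --admin-daemon /var/run/ceph/rbd-client.asok perfcounters_dump | python -mjson.tool | grep -i cache

(Piping the JSON dump through python -mjson.tool is just one convenient way to
get one counter per line for grep.) If the cache counters stay flat during the
--direct=1 runs, that would support the suspicion above that the writeback
cache is not being used for this workload.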