>> This happens with ext4 or btrfs too.

Maybe this is related to the I/O scheduler? Have you compared the cfq,
deadline and noop schedulers? noop should be fast with an SSD. Also,
what is your SAS/SATA controller?

----- Original Message -----
From: "Stefan Priebe" <s.priebe@xxxxxxxxxxxx>
To: "Alexandre DERUMIER" <aderumier@xxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx, "Mark Nelson" <mark.nelson@xxxxxxxxxxx>
Sent: Monday, 28 May 2012 21:48:34
Subject: Re: poor OSD performance using kernel 3.4

On 28.05.2012 08:52, Alexandre DERUMIER wrote:
>>> I think the filestore journal only works in parallel with btrfs.
>>> Other filesystems are writeahead.
>>>> ... you might be right, but i can't change ceph's implementation.
>
> See my schema,
> I think you see parallel writes because you see the flush of the first
> wave to disk at the same time as the second wave's write to the journal.

Yes, i fully understand and agree - but this should still result in a
constant bandwidth near the maximum of the underlying disk.

>>> I totally agree with you, but this is just a test setup AND if you
>>> have a big log file to copy, let's say 100GB, your journal will never
>>> be big enough and the speed should never drop to 0MB/s. Also, i see
>>> the correct behaviour with 3.0.X, where the speed is maxed to the
>>> underlying device. So i still see no reason why with 3.4 the speed
>>> drops to 0MB/s and is mostly 10-20MB/s instead of 130MB/s.
>
> Maybe something is wrong with 3.4, and your disk writes more slowly
> (xfs bug, sata controller driver bug, ...).

This happens with ext4 or btrfs too. Sequential write speed to the FS is
exactly the same under 3.0 and 3.4 using oflag=direct.

3.4:
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 41,4899 s, 253 MB/s

3.0:
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 40,861 s, 257 MB/s

> maybe some local benchmark of your ssd with 3.4 can give some tips?

>>> How many disks (7.2K) do you have per OSD?
>>>> One Intel 520 SSD per OSD.
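As a side note, the dd figures quoted above are internally consistent; a quick sanity check (bytes / seconds, decimal megabytes), writing the locale decimal commas from the dd output as dots:

```shell
# Recompute dd's reported throughput for the two quoted runs.
# mbps BYTES SECONDS -> rounded MB/s (10^6 bytes per MB, as dd uses).
mbps() { awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b / s / 1000000 }'; }

mbps 10485760000 41.4899   # kernel 3.4 run: prints 253
mbps 10485760000 40.861    # kernel 3.0 run: prints 257
```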
> I see some benchmarks on the internet of about 150-300MB/s (depending
> on the blocksize).

ceph osd tell X bench shows me a speed of around 260MB/s under both
kernels, which corresponds to the dd results above.

> Something must be wrong; doing a local benchmark can really help, I
> think. You can use sysbench-tools:
> https://github.com/tsuna/sysbench-tools
> It makes benchmark comparisons with nice graphs.

Thanks, hopefully i'll find something.

Stefan

--
Alexandre Derumier
Ingénieur Système
Fixe : 03 20 68 88 90
Fax : 03 20 68 90 81
45 Bvd du Général Leclerc 59100 Roubaix - France
12 rue Marivaux 75002 Paris - France
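[For reference on the scheduler question at the top of the thread: the active I/O scheduler can be inspected and switched per block device through sysfs. A sketch, assuming the SSD shows up as sdb (switching requires root, and the change does not persist across reboots):]

```shell
# Show the schedulers available for the device; the active one is shown
# in brackets, e.g. "noop deadline [cfq]".
cat /sys/block/sdb/queue/scheduler

# Switch to noop for the test; it skips request reordering, which an SSD
# does not benefit from.
echo noop > /sys/block/sdb/queue/scheduler
```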