Sumit,
I think random read/write will almost always outperform sequential read/write in Ceph unless you have some kind of cache in front or proper striping enabled in the image. The reasons are the following.
1. If you are using the default image options, the object size is 4 MB, the stripe unit is also 4 MB, and the stripe count is 1 (i.e. no fancy striping).
2. You didn't mention your write size, but if it is less than 4 MB, two consecutive sequential writes will almost always land in the same object, and therefore the same PG, and they will be serialized within the OSD.
3. But two random writes will most probably land on different objects, and thus different PGs, so they can be processed in parallel.
4. The same holds for random vs. sequential reads. Increasing read_ahead_kb to a reasonably big number will improve sequential read speed. If you are using librbd, rbd_cache should help you for both reads and writes, I guess. (A sample client config is sketched after this list.)
5. Another option you may want to try is to set the stripe_unit/stripe_count/object_size so that the stripe unit matches your I/O size; consecutive sequential reads/writes then land on different objects, and in that case the difference should go away. (See the rbd create sketch after this list.)
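For point 4, a minimal sketch of what that could look like. The sizes below are only example values, not tuned recommendations, and the block device name (rbd0 here) depends on how and where the image is mapped:

    # ceph.conf on the client side (picked up by librbd)
    [client]
        rbd cache = true
        rbd cache writethrough until flush = true
        rbd cache size = 33554432        # 32 MB, example value only

    # for a kernel-mapped image, raise readahead on the block device
    echo 4096 > /sys/block/rbd0/queue/read_ahead_kb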
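For point 5, roughly how such an image could be created. This is just a sketch assuming (for example) a 64 KB application I/O size; the pool/image names are made up, and option spellings/units may differ slightly between releases:

    # 4 MB objects (--order 22 = 2^22 bytes), 64 KB stripe unit,
    # striped across 16 objects, so 16 consecutive 64 KB writes hit
    # 16 different objects (and most likely different PGs/OSDs)
    rbd create --size 10240 --order 22 --stripe-unit 65536 --stripe-count 16 rbd/testimg

Note that, as far as I remember, this kind of fancy striping is only honored by librbd; the kernel rbd client may not support a stripe unit different from the object size.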
Hope this is helpful.
Thanks & Regards
Somnath
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Sumit Gaur
Sent: Sunday, February 01, 2015 6:55 PM
To: Florent MONTHEL
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: ceph Performance random write is more than sequential
Hi All,
What I saw is that after enabling the RBD cache it works as expected, meaning sequential write gets better MB/s than random write. Can somebody explain this behaviour? Is the RBD cache setting a must for a Ceph cluster to behave normally?
Thanks
sumit
On Mon, Feb 2, 2015 at 9:59 AM, Sumit Gaur <sumitkgaur@xxxxxxxxx> wrote:
Hi Florent,
Cache tiering: no.
** Our Architecture :
vdbench/FIO inside VM <--> RBD without cache <-> Ceph Cluster (6 OSDs + 3 Mons)
Thanks
sumit
[root@ceph-mon01 ~]# ceph -s
cluster 47b3b559-f93c-4259-a6fb-97b00d87c55a
health HEALTH_WARN clock skew detected on mon.ceph-mon02, mon.ceph-mon03
monmap e1: 3 mons at {ceph-mon01=192.168.10.19:6789/0,ceph-mon02=192.168.10.20:6789/0,ceph-mon03=192.168.10.21:6789/0}, election epoch 14, quorum 0,1,2 ceph-mon01,ceph-mon02,ceph-mon03
osdmap e603: 36 osds: 36 up, 36 in
pgmap v40812: 5120 pgs, 2 pools, 179 GB data, 569 kobjects
522 GB used, 9349 GB / 9872 GB avail
5120 active+clean
On Mon, Feb 2, 2015 at 12:21 AM, Florent MONTHEL <fmonthel@xxxxxxxxxxxxx> wrote:
Hi Sumit
Do you have cache pool tiering activated?
Can you give some feedback regarding your architecture?
Thanks
Sent from my iPad
> On 1 Feb 2015, at 15:50, Sumit Gaur <sumitkgaur@xxxxxxxxx> wrote:
>
> Hi
> I have installed a 6-node Ceph cluster and, to my surprise, when I ran rados bench I saw that random write has better performance numbers than sequential write. This is the opposite of normal disk behaviour. Can somebody let me know if I am missing some Ceph architecture point here?