Re: more performance issues :(


 



Well, that’s not as straightforward. You can’t let journal writes go unthrottled for a long time; the main reason is that OSD memory usage will keep increasing, and depending on your hardware the OSD will crash at some point.

You can try tweaking the following:

 

journal_max_write_bytes

journal_max_write_entries

journal_queue_max_ops

journal_queue_max_bytes

filestore_queue_max_ops

filestore_queue_max_bytes

 

The existing code base will also force a journal flush once the journal is more than half full.
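 
As an illustration only, these settings go under [osd] in ceph.conf; the values below are placeholders to experiment with, not recommendations, and the right numbers depend entirely on your journal size and hardware:

[osd]
journal max write bytes = 1073741824     # max bytes flushed from the journal to the backing store per write
journal max write entries = 10000        # max entries flushed per journal write
journal queue max ops = 3000             # ops allowed to be queued for the journal
journal queue max bytes = 1073741824     # bytes allowed to be queued for the journal
filestore queue max ops = 500            # ops allowed to be queued for the filestore
filestore queue max bytes = 1073741824   # bytes allowed to be queued for the filestore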

 

If you have a test cluster, you may want to try the following pull request, which should be able to utilize a big journal better. You will need to tweak some config options according to your setup (I have never run it on an HDD-based setup). If you are interested, let me know and I can help you with that.

 

https://github.com/ceph/ceph/pull/6670

 

Thanks & Regards

Somnath

 

From: Florian Rommel [mailto:florian.rommel@xxxxxxxxxxxxxxx]
Sent: Wednesday, December 30, 2015 2:54 AM
To: Somnath Roy
Cc: Tyler Bishop; ceph-users@xxxxxxxxxxxxxx
Subject: Re: more performance issues :(

 

Hi all, again thanks for all the suggestions..

 

I have now narrowed it down to this problem:

 

Data gets written to the journal (SSD), but while the journal flushes out to the SATA disks it does not keep taking new writes; it essentially stalls until the flush to SATA completes. During that time the data transfer rate drops so low it almost looks like the same speed as the SATA disks. When the flush is done, you can see an uptick in speed until the next flush.

 

I have 15 GB SSD journal partitions for 2-3 SATA disks on each server, 10 OSDs in total for now. Is there a ceph.conf option to let the journal fill up before flushing, or to keep writing to the journal while it is flushing?

 

Thanks,

//florian

 

 

 

On 26 Dec 2015, at 20:46, Somnath Roy <Somnath.Roy@xxxxxxxxxxx> wrote:

 

FYI, osd_op_threads is not used in the main IO path anymore (since Giant). I don’t think increasing it will do any good.

If you want to tweak the threads in the IO path, play with the following two:

 

osd_op_num_threads_per_shard

osd_op_num_shards
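 
For example, under [osd] in ceph.conf (values are illustrative; if I remember correctly the Hammer defaults are already 5 shards with 2 threads each, so only raise them if testing shows a benefit):

[osd]
osd op num shards = 10                # number of operation work-queue shards
osd op num threads per shard = 2      # worker threads servicing each shard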

 

But it may not be a problem with writes; the default values should work just fine. I need some more info:

 

1. Are you using fio-rbd? If so, try running with rbd_cache = false in the client-side ceph.conf and see if that makes any difference.
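 
That is, something like this in the ceph.conf on the client machine (a minimal sketch):

[client]
rbd cache = false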

 

2. What block size are you testing with?

 

3. Check how the SSD behaves with raw fio in O_DIRECT and O_DSYNC mode with the same block size.
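 
A sketch of such a test (replace /dev/sdX with the journal SSD and make sure it holds no data you care about, since this writes to the raw device; swap the 4k for whatever block size you are actually testing with):

fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based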

 

4. What kind of fio write IO profile are you running? I hope you are using a similar IO profile with benchwrite.

 

5. How many OSDs is a single SSD serving as a journal? How many OSDs are you running in total? What is the replication factor?

 

6. I hope none of the resources are saturated.

 

Thanks & Regards

Somnath

 

 

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Tyler Bishop
Sent: Saturday, December 26, 2015 8:38 AM
To: Florian Rommel
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: [ceph-users] more performance issues :(

 

Add this under the [osd] section:

 

osd op threads = 8




Restart the OSD services and try that.
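 
For example, depending on the init system, something along the lines of (repeat for each OSD id on the node):

/etc/init.d/ceph restart osd.0        # sysvinit-style
systemctl restart ceph-osd@0          # systemd-based setups, if the unit files are installed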

 

  

 


From: "Florian Rommel" <florian.rommel@xxxxxxxxxxxxxxx>
To: "Wade Holler" <wade.holler@xxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Saturday, December 26, 2015 4:55:06 AM
Subject: Re: [ceph-users] more performance issues :(

 

Hi, iostat shows all OSDs working when data is benched. It looks like the culprit is nowhere to be found. If I add SSD journals on the SSDs that we have, even though they give a much higher result with fio than the SATA drives, the speed of the cluster is exactly the same: 150-180 MB/s, while reads max out the 10 GbE network with no problem.

rbd bench-write, however, gives me NICE throughput: about 500 MB/s to start with, then dropping and flattening out at 320 MB/s and 90000 IOPS. So what the hell is going on?
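 
(For anyone wanting to reproduce: an rbd bench-write run looks roughly like the following; the image name and parameters here are placeholders, not necessarily the exact ones I used.)

rbd bench-write testimage --io-size 4096 --io-threads 16 --io-total 1073741824 --io-pattern rand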

 

 

If I take the journals off the SSDs and move them onto the disks themselves, same results. Something is really off with my config, I guess, and I need to do some serious troubleshooting to figure this out.

 

Thanks for the help so far.

//Florian

 

 

 

On 24 Dec 2015, at 13:54, Wade Holler <wade.holler@xxxxxxxxx> wrote:

 

Have a look at the iostat -x 1 1000 output to see what the drives are doing.
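 
e.g.:

iostat -x 1 1000

In particular, compare await and %util on the journal SSDs versus the SATA OSD disks to see which side is the bottleneck.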

 

On Wed, Dec 23, 2015 at 4:35 PM Florian Rommel <florian.rommel@xxxxxxxxxxxxxxx> wrote:

Ah, totally forgot the additional details :)

OS is SUSE Linux Enterprise 12.0 with all patches,
Ceph version 0.94.3
4-node cluster with 2x 10 GbE networking, one for the cluster network and one for the public network, plus 1 additional server purely as an admin server.
The test machine is also connected via 10 GbE.

ceph.conf is included:
[global]
fsid = 312e0996-a13c-46d3-abe3-903e0b4a589a
mon_initial_members = ceph-admin, ceph-01, ceph-02, ceph-03, ceph-04
mon_host = 192.168.0.190,192.168.0.191,192.168.0.192,192.168.0.193,192.168.0.194
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
public network = 192.168.0.0/24
cluster network = 192.168.10.0/24

osd pool default size = 2
[osd]
osd journal size = 2048

Thanks again for any help, and merry Xmas already.
//F

 



 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
