Re: AW: AW: radosrgw performance problems

I spoke to Yehuda (who develops RGW), and he mentioned that it may be latency due to SSL handshake. How big are the objects you are writing?

With RBD, I can do much better than 40% of the rados throughput, but it takes a lot of concurrency. I use fio with libaio, direct=1, 4MB writes, and a high iodepth on multiple volumes to get there. By the way, rados bench keeps 16 objects in flight by default too.

Mark

On 06/12/2013 10:14 AM, Jäger, Philipp wrote:
No, not really:

30239 www-data  20   0  751m 7556 2036 S   10  0.4   0:03.67 apache2
  1955 root      20   0 2048m  10m 4352 S    6  0.5   2:14.54 radosgw

Apache was at 10% CPU and the load was 0.4. The vCenter performance graph also shows less than 15% usage.
We are setting up a physical server right now, because we also wondered whether the CPU is missing instruction sets.

Another perf question:

As I said, we can write about 170 MB/s with rados bench:
rados bench -p test 100 write:
Bandwidth (MB/sec):     171.744.

With rbd or rgw (without https) we get less than 40 MB/s:
(time rados -p connect put 600mb.iso 600mb.iso
real    0m15.846s
user    0m0.640s
sys     0m0.836s)
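
For reference, the throughput implied by this timing can be worked out directly. This is only a minimal sketch: it assumes the file is exactly 600 MB, as the name 600mb.iso suggests (the real size is not stated in the thread), and the function name is made up for illustration.

```python
def throughput_mb_per_s(size_mb, seconds):
    """Return the average throughput in MB/s for a transfer of size_mb taking seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_mb / seconds

# Assumed size: 600 MB (inferred from the file name, not measured).
put_rate = throughput_mb_per_s(600, 15.846)

# MB/s, the bandwidth figure reported by rados bench in this message.
bench_rate = 171.744

print(f"rados put: {put_rate:.1f} MB/s")
print(f"fraction of rados bench: {put_rate / bench_rate:.0%}")
```

Under that size assumption the single put works out to roughly 38 MB/s, a bit under a quarter of the rados bench figure.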

I think you can also close the bug! Because of https I have to specify the protocol, and then I get the message "bad protocol" (as in http://tracker.ceph.com/issues/3968), so it is not possible to benchmark with rest-bench at the moment.

I've configured the admin socket, but I don't know how to "read" the output of the perfcounters in a meaningful way...
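
One possible way to read the perf counter output: the dump is JSON, and latency-style counters are typically reported as a pair of "avgcount" (number of samples) and "sum" (accumulated total), so the mean is sum / avgcount. The sketch below assumes that layout. The section name, counter names and values in the sample are illustrative only and were not captured from the cluster in this thread.

```python
import json

# Illustrative sample only: section name, counter names and values are
# assumptions, not output from the cluster discussed in this thread.
SAMPLE = """
{
  "client.radosgw.example": {
    "req": 120,
    "put": 100,
    "put_b": 629145600,
    "put_initial_lat": {"avgcount": 100, "sum": 12.5}
  }
}
"""

def summarize(dump):
    """Flatten a perf dump into {"section.counter": value}, averaging avgcount/sum pairs."""
    out = {}
    for section, counters in dump.items():
        for name, value in counters.items():
            if isinstance(value, dict) and "avgcount" in value and "sum" in value:
                count = value["avgcount"]
                # Mean per sample; None when nothing has been recorded yet.
                out[f"{section}.{name}.avg"] = value["sum"] / count if count else None
            else:
                out[f"{section}.{name}"] = value
    return out

summary = summarize(json.loads(SAMPLE))
for key, value in sorted(summary.items()):
    print(key, value)
```

Taking two dumps, one before and one after a test upload, and comparing the differences may show more than a single snapshot, since the counters accumulate from daemon start.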


Thank you very much so far.

Philipp

-----Original Message-----
From: Mark Nelson [mailto:mark.nelson@xxxxxxxxxxx]
Sent: Wednesday, 12 June 2013 16:53
To: Jäger, Philipp
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: AW: radosrgw performance problems

Interesting.  Was Apache using excessive CPU?  Do your processors and libraries support AES-NI?  Seems strange that at this level that would be the limiting factor, but I've seen stranger things...  Glad you figured it out!

Mark

On 06/12/2013 05:52 AM, Jäger, Philipp wrote:
Hello,

I identified the problem.

When I deactivate SSL in the Apache config and connect via http, I get
the 40 MB/s (with SSL: 8 MB/s). Do you have experience with SSL? Is this normal?

Thanks

Regards



-----Original Message-----
From: Jäger, Philipp
Sent: Wednesday, 12 June 2013 10:22
To: 'Mark Nelson'
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: AW: radosrgw performance problems

Hello,

I've added my answers below.

Thanks

Regards

Philipp

-----Original Message-----
From: Mark Nelson [mailto:mark.nelson@xxxxxxxxxxx]
Sent: Tuesday, 11 June 2013 16:38
To: Jäger, Philipp
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: radosrgw performance problems

On 06/11/2013 08:27 AM, Jäger, Philipp wrote:
Hello,

We have a performance problem with radosgw.
We get only 8-9 MB/s per upload, also when testing with s3cmd on the rgw host itself.
(2 uploads at the same time: 15 MB/s combined; 3 uploads at the same
time: 21 MB/s combined.) But when putting a file via rados rbd, we get 40 MB/s upload, so there is no network or other general problem.
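
The concurrent-upload figures can be broken down per stream to see how close they are to linear scaling. This is a small sketch of that arithmetic, not a measurement: it takes 8.5 MB/s as the midpoint of the stated 8-9 MB/s range (an assumption), and the helper names are made up for illustration.

```python
# Aggregate upload rates in MB/s, keyed by number of simultaneous uploads.
# 8.5 is an assumed midpoint of the "8-9" range quoted in the message.
aggregate = {1: 8.5, 2: 15.0, 3: 21.0}

def per_stream(rates):
    """Divide each aggregate rate by its number of streams."""
    return {n: total / n for n, total in rates.items()}

def scaling_efficiency(rates):
    """Aggregate rate relative to perfect linear scaling of the 1-stream rate."""
    base = rates[1]
    return {n: total / (base * n) for n, total in rates.items()}

streams = per_stream(aggregate)
efficiency = scaling_efficiency(aggregate)
for n in sorted(aggregate):
    print(f"{n} upload(s): {streams[n]:.1f} MB/s each, "
          f"{efficiency[n]:.0%} of linear scaling")
```

Aggregate throughput that keeps growing almost linearly with the number of uploads would be consistent with each request being limited by per-request latency or overhead rather than by the network or the cluster as a whole.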

One thing to check is to make sure that the rgw pool you are writing to has enough placement groups for your cluster.  The default may be extremely low.

[Philipp] We don't use the default pool; we created a new pool with 1500 PGs and see the same
problem (30 OSDs).


We see the same speed with the Inktank apache/fastcgi and the original one.
The hardware is also fast enough. We use Ubuntu 12.04 LTS and ceph 0.61.2.

So do you have any idea why the rgw is so slow? How can we identify where the problem is?

RBD is pretty streamlined so you can get good performance with it.  On my test setup I'm seeing 80-90% of the performance of raw rados object writes/reads (and in some cases much faster with RBD cache enabled!).
RGW, Apache, fastcgi, and simply the requirements of supporting the S3 protocol itself add a lot of overhead.  MD5 calculations by themselves start chewing up a ton of CPU once you try to support high throughput scenarios and there is a non-trivial amount of extra latency added as well.  You may be able to improve things with some tweaks, but I wouldn't be surprised if RBD is always going to be faster to an extent.

[Philipp] We are talking about 9 MB/s per rgw, which is less than 1/4 of rbd (rados put: 40 MB/s); with rados bench we actually get: Bandwidth (MB/sec):     171.744.
So I think we are not talking about tweaking, but rather a general problem?


For folks who want really fast object storage I think directly utilizing rados is probably the way to go, but that requires modifying the app and it's not for everyone.


(I've heard something about the rgw admin socket for checking
perfcounters, but it seems that this is deprecated? When I
type ceph --admin-daemon ... it says unknown command, and I cannot
find it in the ceph documentation. Then I wanted to benchmark via rest-bench, but
it says "ERROR: failed to create bucket: XmlParseFailure -failed
initializing benchmark", so I could not measure the speed.)

Connecting with the admin daemon should still be supported.
Documentation is here:

http://ceph.com/docs/next/radosgw/troubleshooting/

If this doesn't work please let me know!

[Philipp] How do you activate an rgw admin socket? I think we have to add an entry in ceph.conf? I assume the admin socket is not the same as the "rgw socket path"?


Also, I've created a bug for the rest-bench issue:

http://tracker.ceph.com/issues/5302

Personally I've been using swift-bench for most of my recent rgw testing.

Mark


Ceph.conf- rgw part:

[client.radosgw.connect2]
host = hcrgwko2
rgw socket path = /tmp/connect2.sock
log file = /var/log/ceph/connect2.log
rgw dns name = FQDN

Thank you very much.


Regards

Philipp
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel"
in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo
info at  http://vger.kernel.org/majordomo-info.html



