On Fri, Sep 20, 2013 at 1:50 PM, Matt Thompson <wateringcan@xxxxxxxxx> wrote:
>
> Hi Yehuda / Mark,
>
> Thanks for the information!  We will try keystone authentication again
> when the next dumpling dot release is out.
>
> As for "ceph cache", are you referring to "rgw_cache_enabled"?  If so, we
> don't have that set in our ceph.conf, so we should in theory be using it
> already.

I actually meant to say ceph authentication, not ceph cache.

Yehuda

>
> On Fri, Sep 20, 2013 at 3:57 PM, Yehuda Sadeh <yehuda@xxxxxxxxxxx> wrote:
>>
>> On Fri, Sep 20, 2013 at 3:51 AM, Matt Thompson <wateringcan@xxxxxxxxx> wrote:
>> > Hi Yehuda,
>> >
>> > I did try bumping up pg_num on .rgw, .rgw.buckets, and .rgw.buckets.index
>> > from 8 to 220 prior to writing to the list, but when I saw no difference
>> > in performance I set it back to 8 (by creating new pools, etc.).
>> >
>> > One thing we have since noticed is that radosgw is validating tokens on
>> > each request; when we use ceph authentication instead, we see much more
>> > promising results from swift-bench.
>> >
>> > Is there a known issue with keystone token caching in radosgw?  It's my
>> > understanding that 10,000 tokens should be cached by default, but this
>> > doesn't appear to be the case.  I've explicitly set
>> > rgw_keystone_token_cache_size in /etc/ceph/ceph.conf on my radosgw node,
>> > yet radosgw continues to hit keystone on each request.
>> >
>>
>> Looking at the code now, I think I see the culprit.  It's something that
>> was actually fixed in recent versions but isn't there in dumpling.  I
>> opened a ticket for it (6360) and I'll prepare a fix that will hopefully
>> make it into the next dumpling dot release.  In the meantime the way to
>> go would be to use the ceph cache.
>>
>> > Additionally, what does /var/lib/ceph/radosgw/ceph-radosgw.gateway get
>> > used for?  I see the docs mention that it needs to be created, yet it
>> > remains unpopulated on my nodes, and doing a quick scan of the ceph code
>> > I see no reference to it being used anywhere (though I may be missing
>> > something).
>>
>> That looks like a generic ceph directory that can be used to hold your
>> specific user's keyring file (but I might be wrong).
>>
>> > Thanks again for the help!
>> >
>> > -Matt
>> >
>> > On Thu, Sep 19, 2013 at 5:01 PM, Yehuda Sadeh <yehuda@xxxxxxxxxxx> wrote:
>> >>
>> >> On Thu, Sep 19, 2013 at 8:52 AM, Matt Thompson <wateringcan@xxxxxxxxx>
>> >> wrote:
>> >> > Hi All,
>> >> >
>> >> > We're trying to test the Swift API performance of swift itself (1.9.0)
>> >> > and ceph's radosgw (0.67.3) using the following hardware configuration:
>> >> >
>> >> > Shared servers:
>> >> >
>> >> > * 1 server running keystone for authentication
>> >> > * 1 server running swift-proxy, a single MON, and radosgw + Apache /
>> >> >   FastCGI
>> >> >
>> >> > Ceph:
>> >> >
>> >> > * 4 storage servers, 5 storage disks / 5 OSDs on each (no separate
>> >> >   disk(s) for the journal)
>> >> >
>> >> > Swift:
>> >> >
>> >> > * 4 storage servers, 5 storage disks on each
>> >> >
>> >> > All 10 machines have identical hardware configurations (including
>> >> > drive type & speed).
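(A side note on the rgw_keystone_token_cache_size setting discussed further
up-thread: the keystone-related options live in the radosgw client section
of ceph.conf.  A rough, untested sketch; the option names are as in the
dumpling-era docs, while the host, token, and values are purely illustrative:

    [client.radosgw.gateway]
        rgw keystone url = http://<keystone-host>:35357
        rgw keystone admin token = <keystone admin token>
        rgw keystone accepted roles = Member, admin
        rgw keystone token cache size = 10000
        rgw keystone revocation interval = 900

The 10000 is just the documented default; given the caching bug Yehuda
describes above (ticket 6360), the cache size setting presumably won't help
on dumpling until that fix lands.)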
>> >> >
>> >> > We deployed ceph w/ ceph-deploy, and both swift and ceph have default
>> >> > configurations with the exception of the following:
>> >> >
>> >> > * custom Inktank packages for apache2 / libapache2-mod-fastcgi
>> >> > * rgw_print_continue enabled
>> >> > * rgw_enable_ops_log disabled
>> >> > * rgw_ops_log_rados disabled
>> >> > * debug_rgw disabled
>> >> >
>> >> > (Actually, swift was deployed with a chef cookbook, so its
>> >> > configuration may be slightly non-standard.)
>> >> >
>> >> > On the ceph storage servers, the filesystem type (XFS), filesystem
>> >> > mount options, pg_nums on pools, etc. have all been left at the
>> >> > defaults (8 on the radosgw-related pools, IIRC).
>> >>
>> >> 8 pgs per pool, especially for the data / index pools, is awfully low,
>> >> and probably where your bottleneck is.
>> >>
>> >> > Doing a preliminary test with swift-bench (concurrency = 10,
>> >> > object_size = 1), we're seeing the following:
>> >> >
>> >> > Ceph:
>> >> >
>> >> > 1000 PUTS **FINAL** [0 failures], 14.8/s
>> >> > 10000 GETS **FINAL** [0 failures], 40.9/s
>> >> > 1000 DEL **FINAL** [0 failures], 34.6/s
>> >> >
>> >> > Swift:
>> >> >
>> >> > 1000 PUTS **FINAL** [0 failures], 21.7/s
>> >> > 10000 GETS **FINAL** [0 failures], 139.5/s
>> >> > 1000 DEL **FINAL** [0 failures], 85.5/s
>> >> >
>> >> > That's a relatively significant difference.  Would we see any real
>> >> > difference from moving the journals to an SSD per server, or to a
>> >> > separate partition per OSD disk?  These machines are not seeing any
>> >> > load beyond what's being generated by swift-bench.
>> >>
>> >> Maybe, but I think at this point you're hitting the low-pgs issue.
>> >>
>> >> > Alternatively, would we see any quick wins from standing up more MONs,
>> >> > or from moving the MON off the server running radosgw + Apache /
>> >> > FastCGI?
>> >>
>> >> I don't think that's going to make much of a difference right now.
>> >>
>> >> Yehuda
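(A rough sketch of what raising the pg count on the existing radosgw pools
might look like, since that is the main suggestion above: pg_num on an
existing pool can be increased in place rather than by recreating the pool.
Untested here; the pool names are the defaults mentioned in this thread, and
256 is only a placeholder, so pick a value that suits your OSD count:

    ceph osd pool set .rgw.buckets pg_num 256
    ceph osd pool set .rgw.buckets pgp_num 256
    ceph osd pool set .rgw.buckets.index pg_num 256
    ceph osd pool set .rgw.buckets.index pgp_num 256

pgp_num needs to be raised as well, after pg_num, or the data won't actually
rebalance across the new placement groups.)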