This operation stalled quite a bit, seems that it was waiting for the osd: 2.547155 7f036ffc7700 1 -- 10.102.4.11:0/1009401 --> 10.102.4.14:6809/7428 -- osd_op(client.78418684.0:27514711 .bucket.meta.<CNT-UUID-FINDME>:default.78418684.122043 [call version.read,getxattrs,stat] 5.3b7d1197 ack+read e16034) v4 -- ?+0 0x7f026802e2c0 con 0x7f040c055ca0 ... 7.619750 7f041ddf4700 1 -- 10.102.4.11:0/1009401 <== osd.32 10.102.4.14:6809/7428 208252 ==== osd_op_reply(27514711 .bucket.meta.<CNT-UUID-FINDME>:default.78418684.122043 [call,getxattrs,stat] v0'0 uv6371 ondisk = 0) v6 ==== 338+0+336 (3685145659 0 4232894755) 0x7f00e430f540 con 0x7f040c055ca0 By looking at these logs it seems that there are only 8 pgs on the .rgw pool, if this is correct then you may want to change that considering your workload. Yehuda On Tue, Oct 7, 2014 at 3:46 PM, Daniel Schneller <daniel.schneller@xxxxxxxxxxxxxxxx> wrote: > Hi! > > Sorry, I must have missed the enabling of that debug module. > However, the test setup has been the same all the time - > I only have the one test-application :) > > But maybe I phrased it a bit ambiguously when I wrote > >> It then continuously created containers - both empty >> and such with 10 objects of around 100k random data in them. > > 100 kilobytes is the size of a single object, of which we create 10 > per container. The container gets created first, without any > objects, naturally, then 10 objects are added. One of these objects > is called “version”, the rest have generated names with a fixed > prefix and appended 1-9. The version object is the one I picked > for the example logs I sent earlier. > > I hope this makes the setup clearer. > > Attached you will find the (now more extensive) logs for the outliers > again. As you did not say that I garbled the logs, I assume the > pre-processing was OK, so I have prepared the new data in a similar > fashion, marking the relevant request with <CNT-UUID-FINDME>. > > I have not removed any lines in between the beginning of the > “interesting” request and its completion to keep all the network > traffic log intact. Due to the increased verbosity, I will not post > the logs inline, but only attach them gzipped. > > As before, should the full data set be needed, I can provide > an archived version. > > > > > Thanks for your support! > Daniel > > > > >> On 07 Oct 2014, at 22:45, Yehuda Sadeh <yehuda@xxxxxxxxxx> wrote: >> >> The logs here don't include the messenger (debug ms = 1). It's hard to >> tell what going on from looking at the outliers. Also, in your >> previous mail you described a different benchmark, you tested writing >> large number of objects into a single bucket, whereas in this test >> you're testing multiple bucket creations, which have a completely >> different characteristics. >> >> >> On Tue, Oct 7, 2014 at 1:26 PM, Daniel Schneller >> <daniel.schneller@xxxxxxxxxxxxxxxx> wrote: >>> Hi! >>> >>> I have re-run our test as follows: >>> >>> * 4 Rados Gateways, on 4 baremetal machines which have >>> a total of 48 spinning rust OSDs. >>> >>> * Benchmark run on a virtual machine talking to HAProxy >>> which balances the requests across the 4 Rados GWs. >>> >>> * Three instances of the benchmark run in parallel. Each >>> instance creates 1000 containers, puts 11 objects into >>> each container. Once they have all been created, each >>> instances deletes its own containers again. >>> >>> I configured one of the radosgws with the debug levels >>> you requested. The tests produced quite an amount of data >>> (approx. 1GB of text), so I took the liberty to >>> pre-process that a bit. >>> >>> In this run we landed at around 1.2s per container >>> created (including the objects in them) on average. >>> This was slightly better than the 1.6s we witnessed >>> before, but that test ran for much longer, so there might >>> have been some queue-up effect. >>> >>> Interestingly enough the average is actually somewhat >>> misleading. The logs below show the creation of one >>> object in a container each, one being the fastest of this >>> benchmark (at least on the debug-enabled radosgw), one >>> being the slowest. >>> >>> The fastest one was completed in 0.013s, the longest one >>> took 4.93s(!). >>> >>> The attached logs are cleaned up so that they each show >>> just a single request and replaced longish, but constant >>> information with placeholders. Our container names are >>> of the form “stresstest-xxxxxxxxxxx” which I shortened >>> to “<CNT-UUID>” for brevity. Also, I removed the redundant >>> prefix (date, hour, minute of day). >>> >>> The column before the log level looked like a thread-id. >>> As I focused on a single request, I removed all the lines >>> that did not match the same id, replacing the actual value >>> with “<ID>”. That makes the logs much easier to read and >>> understand. >>> >>> Just in case I might have removed too much information >>> for the logs to be useful, the complete log is available >>> in BZIP2 compressed form for download. Just let me know >>> if you need it, then I will provide a link via direct email. >>> >>> To me it seems like there might indeed be a contention >>> issue. It would be interesting to know, if this is correct >>> and if there are any settings that we could adjust to >>> alleviate the issue. >>> >>> Daniel >>> >>> ============================== >>> >>> ➜ ~ cat rados_shortest.txt >>> 21.431185 <ID> 20 QUERY_STRING=page=swift¶ms=/v1/<CNT-UUID>/version >>> 21.431187 <ID> 20 REMOTE_ADDR=10.102.8.140 >>> 21.431188 <ID> 20 REMOTE_PORT=44007 >>> 21.431189 <ID> 20 REQUEST_METHOD=PUT >>> 21.431190 <ID> 20 REQUEST_SCHEME=https >>> 21.431191 <ID> 20 REQUEST_URI=/swift/v1/<CNT-UUID>/version >>> 21.431192 <ID> 20 SCRIPT_FILENAME=/var/www/s3gw.fcgi >>> 21.431193 <ID> 20 SCRIPT_NAME=/swift/v1/<CNT-UUID>/version >>> 21.431194 <ID> 20 >>> SCRIPT_URI=https://localhost:8405/swift/v1/<CNT-UUID>/version >>> 21.431195 <ID> 20 SCRIPT_URL=/swift/v1/<CNT-UUID>/version >>> 21.431196 <ID> 20 SERVER_ADDR=10.102.9.11 >>> 21.431197 <ID> 20 SERVER_ADMIN=[no address given] >>> 21.431198 <ID> 20 SERVER_NAME=localhost >>> 21.431199 <ID> 20 SERVER_PORT=8405 >>> 21.431200 <ID> 20 SERVER_PORT_SECURE=443 >>> 21.431201 <ID> 20 SERVER_PROTOCOL=HTTP/1.1 >>> 21.431202 <ID> 20 SERVER_SIGNATURE= >>> 21.431203 <ID> 20 SERVER_SOFTWARE=Apache/2.4.7 (Ubuntu) >>> 21.431205 <ID> 1 ====== starting new request req=0x7f038c019e90 ===== >>> 21.431219 <ID> 2 req 980641:0.000015::PUT >>> /swift/v1/<CNT-UUID>/version::initializing >>> 21.431259 <ID> 10 ver=v1 first=<CNT-UUID> req=version >>> 21.431265 <ID> 10 s->object=version s->bucket=<CNT-UUID> >>> 21.431269 <ID> 2 req 980641:0.000065:swift:PUT >>> /swift/v1/<CNT-UUID>/version::getting op >>> 21.431274 <ID> 2 req 980641:0.000070:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:authorizing >>> 21.431321 <ID> 20 get_obj_state: rctx=0x7f030800b720 >>> obj=.users.swift:documentstore:swift state=0x7f03080f31e8 s->prefetch_data=0 >>> 21.431332 <ID> 10 cache get: name=.users.swift+documentstore:swift : hit >>> 21.431338 <ID> 20 get_obj_state: s->obj_tag was set empty >>> 21.431344 <ID> 10 cache get: name=.users.swift+documentstore:swift : hit >>> 21.431369 <ID> 20 get_obj_state: rctx=0x7f030800b720 >>> obj=.users.uid:documentstore state=0x7f03080f31e8 s->prefetch_data=0 >>> 21.431374 <ID> 10 cache get: name=.users.uid+documentstore : hit >>> 21.431378 <ID> 20 get_obj_state: s->obj_tag was set empty >>> 21.431382 <ID> 10 cache get: name=.users.uid+documentstore : hit >>> 21.431401 <ID> 10 swift_user=documentstore:swift >>> 21.431416 <ID> 20 build_token >>> token=13000000646f63756d656e7473746f72653a737769667406a4b2ba3999f8a84f45355438d8ff17 >>> 21.431467 <ID> 2 req 980641:0.000262:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:reading permissions >>> 21.431493 <ID> 20 get_obj_state: rctx=0x7f03837ed250 obj=.rgw:<CNT-UUID> >>> state=0x7f03080f31e8 s->prefetch_data=0 >>> 21.431508 <ID> 10 cache get: name=.rgw+<CNT-UUID> : type miss (requested=22, >>> cached=19) >>> 21.433081 <ID> 10 cache put: name=.rgw+<CNT-UUID> >>> 21.433106 <ID> 10 removing entry: >>> name=.rgw+stresstest-ab9ee3e2-dcf5-4a5b-ab40-931d94c7784038242 from cache >>> LRU >>> 21.433114 <ID> 10 moving .rgw+<CNT-UUID> to cache LRU end >>> 21.433120 <ID> 20 get_obj_state: s->obj_tag was set empty >>> 21.433122 <ID> 20 Read xattr: user.rgw.idtag >>> 21.433124 <ID> 20 Read xattr: user.rgw.manifest >>> 21.433129 <ID> 10 cache get: name=.rgw+<CNT-UUID> : hit >>> 21.433141 <ID> 20 rgw_get_bucket_info: bucket instance: >>> <CNT-UUID>(@{i=.rgw.buckets.index}.rgw.buckets[default.78418684.118911]) >>> 21.433148 <ID> 20 reading from >>> .rgw:.bucket.meta.<CNT-UUID>:default.78418684.118911 >>> 21.433169 <ID> 20 get_obj_state: rctx=0x7f03837ed250 >>> obj=.rgw:.bucket.meta.<CNT-UUID>:default.78418684.118911 >>> state=0x7f0308005778 s->prefetch_data=0 >>> 21.433185 <ID> 10 cache get: >>> name=.rgw+.bucket.meta.<CNT-UUID>:default.78418684.118911 : type miss >>> (requested=22, cached=19) >>> 21.434632 <ID> 10 cache put: >>> name=.rgw+.bucket.meta.<CNT-UUID>:default.78418684.118911 >>> 21.434650 <ID> 10 moving >>> .rgw+.bucket.meta.<CNT-UUID>:default.78418684.118911 to cache LRU end >>> 21.434657 <ID> 10 updating xattr: name=user.rgw.acl bl.length()=177 >>> 21.434664 <ID> 20 get_obj_state: s->obj_tag was set empty >>> 21.434667 <ID> 20 Read xattr: user.rgw.acl >>> 21.434668 <ID> 20 Read xattr: user.rgw.idtag >>> 21.434669 <ID> 20 Read xattr: user.rgw.manifest >>> 21.434675 <ID> 10 cache get: >>> name=.rgw+.bucket.meta.<CNT-UUID>:default.78418684.118911 : hit >>> 21.434701 <ID> 15 Read AccessControlPolicy<AccessControlPolicy >>> xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>documentstore</ID><DisplayName>Document >>> Store</DisplayName></Owner><AccessControlList><Grant><Grantee >>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >>> xsi:type="CanonicalUser"><ID>documentstore</ID><DisplayName>Document >>> Store</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy> >>> 21.434710 <ID> 2 req 980641:0.003506:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:init op >>> 21.434716 <ID> 2 req 980641:0.003512:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:verifying op mask >>> 21.434719 <ID> 20 required_mask= 2 user.op_mask=7 >>> 21.434720 <ID> 2 req 980641:0.003516:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:verifying op permissions >>> 21.434723 <ID> 5 Searching permissions for uid=documentstore mask=50 >>> 21.434725 <ID> 5 Found permission: 15 >>> 21.434726 <ID> 5 Searching permissions for group=1 mask=50 >>> 21.434727 <ID> 5 Permissions for group not found >>> 21.434729 <ID> 5 Searching permissions for group=2 mask=50 >>> 21.434730 <ID> 5 Permissions for group not found >>> 21.434731 <ID> 5 Getting permissions id=documentstore owner=documentstore >>> perm=2 >>> 21.434732 <ID> 10 uid=documentstore requested perm (type)=2, policy perm=2, >>> user_perm_mask=2, acl perm=2 >>> 21.434734 <ID> 2 req 980641:0.003530:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:verifying op params >>> 21.434737 <ID> 2 req 980641:0.003533:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:executing >>> 21.434850 <ID> 20 get_obj_state: rctx=0x7f03837ed250 obj=<CNT-UUID>:version >>> state=0x7f03080c5378 s->prefetch_data=0 >>> 21.436144 <ID> 10 setting object write_tag=default.78418684.980641 >>> 21.444613 <ID> 2 req 980641:0.013408:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:http status=201 >>> >>> ============================== >>> >>> ➜ ~ cat rados_longest.txt >>> 31.886128 <ID> 20 QUERY_STRING=page=swift¶ms=/v1/<CNT-UUID>/version >>> 31.886129 <ID> 20 REMOTE_ADDR=10.102.8.140 >>> 31.886130 <ID> 20 REMOTE_PORT=46714 >>> 31.886131 <ID> 20 REQUEST_METHOD=PUT >>> 31.886132 <ID> 20 REQUEST_SCHEME=https >>> 31.886134 <ID> 20 REQUEST_URI=/swift/v1/<CNT-UUID>/version >>> 31.886135 <ID> 20 SCRIPT_FILENAME=/var/www/s3gw.fcgi >>> 31.886136 <ID> 20 SCRIPT_NAME=/swift/v1/<CNT-UUID>/version >>> 31.886137 <ID> 20 >>> SCRIPT_URI=https://localhost:8405/swift/v1/<CNT-UUID>/version >>> 31.886138 <ID> 20 SCRIPT_URL=/swift/v1/<CNT-UUID>/version >>> 31.886139 <ID> 20 SERVER_ADDR=10.102.9.11 >>> 31.886140 <ID> 20 SERVER_ADMIN=[no address given] >>> 31.886141 <ID> 20 SERVER_NAME=localhost >>> 31.886143 <ID> 20 SERVER_PORT=8405 >>> 31.886144 <ID> 20 SERVER_PORT_SECURE=443 >>> 31.886145 <ID> 20 SERVER_PROTOCOL=HTTP/1.1 >>> 31.886146 <ID> 20 SERVER_SIGNATURE= >>> 31.886147 <ID> 20 SERVER_SOFTWARE=Apache/2.4.7 (Ubuntu) >>> 31.886148 <ID> 1 ====== starting new request req=0x7f038c024c50 ===== >>> 31.886162 <ID> 2 req 983095:0.000013::PUT >>> /swift/v1/<CNT-UUID>/version::initializing >>> 31.886195 <ID> 10 ver=v1 first=<CNT-UUID> req=version >>> 31.886200 <ID> 10 s->object=version s->bucket=<CNT-UUID> >>> 31.886203 <ID> 2 req 983095:0.000055:swift:PUT >>> /swift/v1/<CNT-UUID>/version::getting op >>> 31.886208 <ID> 2 req 983095:0.000060:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:authorizing >>> 31.886242 <ID> 20 get_obj_state: rctx=0x7f02b0062770 >>> obj=.users.swift:documentstore:swift state=0x7f02b007ac18 s->prefetch_data=0 >>> 31.886250 <ID> 10 cache get: name=.users.swift+documentstore:swift : hit >>> 31.886255 <ID> 20 get_obj_state: s->obj_tag was set empty >>> 31.886260 <ID> 10 cache get: name=.users.swift+documentstore:swift : hit >>> 31.886297 <ID> 20 get_obj_state: rctx=0x7f02b0062770 >>> obj=.users.uid:documentstore state=0x7f02b007ac18 s->prefetch_data=0 >>> 31.886303 <ID> 10 cache get: name=.users.uid+documentstore : hit >>> 31.886308 <ID> 20 get_obj_state: s->obj_tag was set empty >>> 31.886312 <ID> 10 cache get: name=.users.uid+documentstore : hit >>> 31.886329 <ID> 10 swift_user=documentstore:swift >>> 31.886343 <ID> 20 build_token >>> token=13000000646f63756d656e7473746f72653a737769667410f2006ed65dbbaa4f453554207a0f1f >>> 31.886393 <ID> 2 req 983095:0.000245:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:reading permissions >>> 31.886419 <ID> 20 get_obj_state: rctx=0x7f0378fd8250 obj=.rgw:<CNT-UUID> >>> state=0x7f02b007ac18 s->prefetch_data=0 >>> 31.886430 <ID> 10 cache get: name=.rgw+<CNT-UUID> : type miss (requested=22, >>> cached=19) >>> 36.746327 <ID> 10 cache put: name=.rgw+<CNT-UUID> >>> 36.746404 <ID> 10 moving .rgw+<CNT-UUID> to cache LRU end >>> 36.746426 <ID> 20 get_obj_state: s->obj_tag was set empty >>> 36.746431 <ID> 20 Read xattr: user.rgw.idtag >>> 36.746433 <ID> 20 Read xattr: user.rgw.manifest >>> 36.746452 <ID> 10 cache get: name=.rgw+<CNT-UUID> : hit >>> 36.746481 <ID> 20 rgw_get_bucket_info: bucket instance: >>> <CNT-UUID>(@{i=.rgw.buckets.index}.rgw.buckets[default.78418684.119116]) >>> 36.746491 <ID> 20 reading from >>> .rgw:.bucket.meta.<CNT-UUID>:default.78418684.119116 >>> 36.746549 <ID> 20 get_obj_state: rctx=0x7f0378fd8250 >>> obj=.rgw:.bucket.meta.<CNT-UUID>:default.78418684.119116 >>> state=0x7f02b00ce638 s->prefetch_data=0 >>> 36.746585 <ID> 10 cache get: >>> name=.rgw+.bucket.meta.<CNT-UUID>:default.78418684.119116 : type miss >>> (requested=22, cached=19) >>> 36.747938 <ID> 10 cache put: >>> name=.rgw+.bucket.meta.<CNT-UUID>:default.78418684.119116 >>> 36.747955 <ID> 10 moving >>> .rgw+.bucket.meta.<CNT-UUID>:default.78418684.119116 to cache LRU end >>> 36.747963 <ID> 10 updating xattr: name=user.rgw.acl bl.length()=177 >>> 36.747972 <ID> 20 get_obj_state: s->obj_tag was set empty >>> 36.747975 <ID> 20 Read xattr: user.rgw.acl >>> 36.747977 <ID> 20 Read xattr: user.rgw.idtag >>> 36.747978 <ID> 20 Read xattr: user.rgw.manifest >>> 36.747985 <ID> 10 cache get: >>> name=.rgw+.bucket.meta.<CNT-UUID>:default.78418684.119116 : hit >>> 36.748025 <ID> 15 Read AccessControlPolicy<AccessControlPolicy >>> xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>documentstore</ID><DisplayName>Document >>> Store</DisplayName></Owner><AccessControlList><Grant><Grantee >>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >>> xsi:type="CanonicalUser"><ID>documentstore</ID><DisplayName>Document >>> Store</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy> >>> 36.748037 <ID> 2 req 983095:4.861888:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:init op >>> 36.748043 <ID> 2 req 983095:4.861895:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:verifying op mask >>> 36.748046 <ID> 20 required_mask= 2 user.op_mask=7 >>> 36.748050 <ID> 2 req 983095:4.861902:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:verifying op permissions >>> 36.748054 <ID> 5 Searching permissions for uid=documentstore mask=50 >>> 36.748056 <ID> 5 Found permission: 15 >>> 36.748058 <ID> 5 Searching permissions for group=1 mask=50 >>> 36.748060 <ID> 5 Permissions for group not found >>> 36.748061 <ID> 5 Searching permissions for group=2 mask=50 >>> 36.748063 <ID> 5 Permissions for group not found >>> 36.748064 <ID> 5 Getting permissions id=documentstore owner=documentstore >>> perm=2 >>> 36.748066 <ID> 10 uid=documentstore requested perm (type)=2, policy perm=2, >>> user_perm_mask=2, acl perm=2 >>> 36.748069 <ID> 2 req 983095:4.861921:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:verifying op params >>> 36.748072 <ID> 2 req 983095:4.861924:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:executing >>> 36.748200 <ID> 20 get_obj_state: rctx=0x7f0378fd8250 obj=<CNT-UUID>:version >>> state=0x7f02b0042618 s->prefetch_data=0 >>> 36.802077 <ID> 10 setting object write_tag=default.78418684.983095 >>> 36.818727 <ID> 2 req 983095:4.932579:swift:PUT >>> /swift/v1/<CNT-UUID>/version:put_obj:http status=201 >>> >>> ============================== >>> >>> >>> -- >>> Daniel Schneller >>> Mobile Development Lead >>> >>> CenterDevice GmbH | Merscheider Straße 1 >>> | 42699 Solingen >>> tel: +49 1754155711 | Deutschland >>> daniel.schneller@xxxxxxxxxxxxxxxx | www.centerdevice.com >>> >>> >>> >>> >>> On 06 Oct 2014, at 19:26, Yehuda Sadeh <yehuda@xxxxxxxxxx> wrote: >>> >>> It'd be interesting to see which rados operation is slowing down the >>> requests. Can you provide a log dump of a request (with 'debug rgw = >>> 20', and 'debug ms = 1'). This might give us a better idea as to >>> what's going on. >>> >>> Thanks, >>> Yehuda >>> >>> On Mon, Oct 6, 2014 at 10:05 AM, Daniel Schneller >>> <daniel.schneller@xxxxxxxxxxxxxxxx> wrote: >>> >>> Hi again! >>> >>> We have done some tests regarding the limits of storing lots and >>> lots of buckets through Rados Gateway into Ceph. >>> >>> Our test used a single user for which we removed the default max >>> buckets limit. It then continuously created containers - both empty >>> and such with 10 objects of around 100k random data in them. >>> >>> With 3 parallel processes we saw relatively consistent time of >>> about 500-700ms per such container. >>> >>> This kept steady until we reached approx. 3 million containers >>> after which the time per insert sharply went up to currently >>> around 1600ms and rising. Due to some hiccups with network >>> equipment the tests were aborted a few times, but then resumed without >>> deleting any of the previous runs created containers, so the actual >>> number might be 2.8 or 3.2 million, but still in that ballpark. >>> We aborted the test here. >>> >>> Judging by the advice given earlier (see quoted mail below) that >>> we might hit a limit on some per-user data structures, we created >>> another user account, removed its max-bucket limit as well and >>> restarted the benchmark with that one, _expecting_ the times to be >>> down to the original range of 500-700ms. >>> >>> However, what we are seeing is that the times stay at the 1600ms >>> and higher levels even for that fresh account. >>> >>> Here is the output of `rados df`, reformatted to fit the email. >>> clones, degraded and unfound were 0 in all cases and have been >>> left out for clarity: >>> >>> .rgw >>> ========================= >>> KB: 1,966,932 >>> objects: 9,094,552 >>> rd: 195,747,645 >>> rd KB: 153,585,472 >>> wr: 30,191,844 >>> wr KB: 10,751,065 >>> >>> .rgw.buckets >>> ========================= >>> KB: 2,038,313,855 >>> objects: 22,088,103 >>> rd: 5,455,123 >>> rd KB: 408,416,317 >>> wr: 149,377,728 >>> wr KB: 1,882,517,472 >>> >>> .rgw.buckets.index >>> ========================= >>> KB: 0 >>> objects: 5,374,376 >>> rd: 267,996,778 >>> rd KB: 262,626,106 >>> wr: 107,142,891 >>> wr KB: 0 >>> >>> .rgw.control >>> ========================= >>> KB: 0 >>> objects: 8 >>> rd: 0 >>> rd KB: 0 >>> wr: 0 >>> wr KB: 0 >>> >>> .rgw.gc >>> ========================= >>> KB: 0 >>> objects: 32 >>> rd: 5,554,407 >>> rd KB: 5,713,942 >>> wr: 8,355,934 >>> wr KB: 0 >>> >>> .rgw.root >>> ========================= >>> KB: 1 >>> objects: 3 >>> rd: 524 >>> rd KB: 346 >>> wr: 3 >>> wr KB: 3 >>> >>> >>> We would very much like to understand what is going on here >>> in order to decide if Rados Gateway is a viable option to base >>> our production system on (where we expect similar counts >>> as in the benchmark), or if we need to investigate using librados >>> directly which we would like to avoid if possible. >>> >>> Any advice on what configuration parameters to check or >>> which additional information to provide to analyze this would be >>> very much welcome. >>> >>> Cheers, >>> Daniel >>> >>> >>> -- >>> Daniel Schneller >>> Mobile Development Lead >>> >>> CenterDevice GmbH | Merscheider Straße 1 >>> | 42699 Solingen >>> tel: +49 1754155711 | Deutschland >>> daniel.schneller@xxxxxxxxxxxxxxxx | www.centerdevice.com >>> >>> >>> >>> >>> On 10 Sep 2014, at 19:42, Gregory Farnum <greg@xxxxxxxxxxx> wrote: >>> >>> On Wednesday, September 10, 2014, Daniel Schneller >>> <daniel.schneller@xxxxxxxxxxxxxxxx> wrote: >>> >>> >>> On 09 Sep 2014, at 21:43, Gregory Farnum <greg@xxxxxxxxxxx> wrote: >>> >>> >>> Yehuda can talk about this with more expertise than I can, but I think >>> it should be basically fine. By creating so many buckets you're >>> decreasing the effectiveness of RGW's metadata caching, which means >>> >>> the initial lookup in a particular bucket might take longer. >>> >>> >>> Thanks for your thoughts. With “initial lookup in a particular bucket” >>> do you mean accessing any of the objects in a bucket? If we directly >>> access the object (not enumerating the buckets content), would that >>> still be an issue? >>> Just trying to understand the inner workings a bit better to make >>> more educated guesses :) >>> >>> >>> >>> When doing an object lookup, the gateway combines the "bucket ID" with a >>> mangled version of the object name to try and do a read out of RADOS. It >>> first needs to get that bucket ID though -- it will cache an the bucket >>> name->ID mapping, but if you have a ton of buckets there could be enough >>> entries to degrade the cache's effectiveness. (So, you're more likely to pay >>> that extra disk access lookup.) >>> >>> >>> >>> >>> The big concern is that we do maintain a per-user list of all their >>> buckets — which is stored in a single RADOS object — so if you have an >>> extreme number of buckets that RADOS object could get pretty big and >>> become a bottleneck when creating/removing/listing the buckets. You >>> >>> >>> Alright. Listing buckets is no problem, that we don’t do. Can you >>> say what “pretty big” would be in terms of MB? How much space does a >>> bucket record consume in there? Based on that I could run a few numbers. >>> >>> >>> >>> Uh, a kilobyte per bucket? You could look it up in the source (I'm on my >>> phone) but I *believe* the bucket name is allowed to be larger than the rest >>> combined... >>> More particularly, though, if you've got a single user uploading documents, >>> each creating a new bucket, then those bucket creates are going to serialize >>> on this one object. >>> -Greg >>> >>> >>> >>> >>> should run your own experiments to figure out what the limits are >>> there; perhaps you have an easy way of sharding up documents into >>> different users. >>> >>> >>> Good advice. We can do that per distributor (an org unit in our >>> software) to at least compartmentalize any potential locking issues >>> in this area to that single entity. Still, there would be quite >>> a lot of buckets/objects per distributor, so some more detail on >>> the above items would be great. >>> >>> Thanks a lot! >>> >>> >>> Daniel >>> >>> >>> >>> >>> -- >>> Software Engineer #42 @ http://inktank.com | http://ceph.com >>> >>> >>> >>> _______________________________________________ >>> ceph-users mailing list >>> ceph-users@xxxxxxxxxxxxxx >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>> >>> >>> >>> _______________________________________________ >>> ceph-users mailing list >>> ceph-users@xxxxxxxxxxxxxx >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>> > > _______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com