Wade, I'm having the same problem as you. We currently have 5+ million objects in a bucket and it is not even sharded, so we observe many problems with that. Did you manage to test RGW with tons of files?

2016-05-24 2:45 GMT+03:00 Wade Holler <wade.holler@xxxxxxxxx>:
> We (my customer) are trying to test on Jewel now, but I can say that the
> above behavior was also observed by my customer on Infernalis. After 300
> million or so objects in a single bucket, the cluster basically fell down as
> described above. A few hundred OSDs in this cluster. We are concerned that
> this may not be remedied by a hundreds-of-buckets approach either. Testing
> continues.
>
> On Mon, May 23, 2016 at 7:35 PM Vickey Singh <vickey.singh22693@xxxxxxxxx>
> wrote:
>>
>> Hello Guys
>>
>> Are several million objects with Ceph (for the RGW use case) still an
>> issue? Or is it fixed?
>>
>> Thanks
>> Vickey
>>
>> On Thu, Jan 28, 2016 at 12:55 AM, Krzysztof Księżyk <kksiezyk@xxxxxxxxx>
>> wrote:
>>>
>>> Stefan Rogge <stefan.ceph@...> writes:
>>>
>>> > Hi,
>>> > we are using Ceph with RadosGW and the S3 API.
>>> > With more and more objects in the storage, the write speed slows down
>>> > significantly. With 5 million objects in the storage we had a write
>>> > speed of 10MB/s. With 10 million objects in the storage it's only 5MB/s.
>>> > Is this a common issue?
>>> > Is RadosGW suitable for a large number of objects, or would you
>>> > recommend not using RadosGW with this many objects?
>>> >
>>> > Thank you.
>>> >
>>> > Stefan
>>> >
>>> > I also found a ticket on the Ceph tracker describing the same issue:
>>> >
>>> > http://tracker.ceph.com/projects/ceph/wiki/Rgw_-_bucket_index_scalability
>>> >
>>>
>>> Hi,
>>>
>>> I'm struggling with the same issue on Ceph 9.2.0. Unfortunately I wasn't
>>> aware of it, and now the only way to improve things is to create a new
>>> bucket with bucket index sharding or to change the way our apps store
>>> data into buckets. And of course copy tons of data :( In my case
>>> something also happened to the leveldb files and now I cannot even run
>>> some radosgw-admin commands like:
>>>
>>> radosgw-admin bucket check -b ....
>>>
>>> which causes OSD daemon flapping and process timeout messages in the
>>> logs. PGs containing .rgw.bucket.index cannot even be backfilled to other
>>> OSDs, as the OSD process dies with messages:
>>>
>>> [...]
>>> > 2016-01-25 15:47:22.700737 7f79fc66d700  1 heartbeat_map is_healthy
>>> > 'OSD::osd_op_tp thread 0x7f7992c86700' had suicide timed out after 150
>>> > 2016-01-25 15:47:22.702619 7f79fc66d700 -1 common/HeartbeatMap.cc: In
>>> > function 'bool ceph::HeartbeatMap::_check(const ceph::heartbeat_handle_d*,
>>> > const char*, time_t)' thread 7f79fc66d700 time 2016-01-25 15:47:22.700751
>>> > common/HeartbeatMap.cc: 81: FAILED assert(0 == "hit suicide timeout")
>>> >
>>> >  ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299)
>>> >  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>> > const*)+0x85) [0x7f7a019f4be5]
>>> >  2: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char
>>> > const*, long)+0x2d9) [0x7f7a019343b9]
>>> >  3: (ceph::HeartbeatMap::is_healthy()+0xd6) [0x7f7a01934bf6]
>>> >  4: (ceph::HeartbeatMap::check_touch_file()+0x2c) [0x7f7a019353bc]
>>> >  5: (CephContextServiceThread::entry()+0x15b) [0x7f7a01a10dcb]
>>> >  6: (()+0x7df5) [0x7f79ffa8fdf5]
>>> >  7: (clone()+0x6d) [0x7f79fe3381ad]
>>> >
>>> I don't know - maybe it's because of the number of leveldb files in the
>>> omap folder (5.1GB in total). I read somewhere that things can be improved
>>> by setting 'leveldb_compression' to false and 'leveldb_compact_on_mount'
>>> to true, but I don't know if these options have any effect in 9.2.0 as
>>> they are not documented for this release. I tried 'leveldb_compression'
>>> but without visible effect, and I wasn't brave enough to try
>>> 'leveldb_compact_on_mount' on the production env. But setting it to true
>>> on my test 0.94.5 cluster makes the OSD fail on restart.
>>>
>>> Kind regards -
>>> Krzysztof Księżyk

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
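
[Editorial note] For anyone landing on this thread later: the mitigation discussed above is to enable bucket index sharding before a bucket is created, since in these releases an existing bucket's index cannot be resharded in place and the data has to be copied into a new bucket. Below is a minimal ceph.conf sketch; the option names are real Ceph settings, but the shard count, the RGW section name and the bucket name are placeholder values for illustration, and (as Krzysztof notes) whether the leveldb options take effect on 9.2.0 is not documented.

  # ceph.conf (sketch) -- the section name depends on how your RGW instance is named
  [client.radosgw.gateway]
      # Only affects buckets created after the gateway restarts with this set;
      # existing buckets keep their single, unsharded index object.
      rgw override bucket index max shards = 16

  [osd]
      # LevelDB tuning mentioned above, for the OSDs holding the index pool.
      # Test on a non-production cluster first -- compact-on-mount reportedly
      # made a 0.94.5 test OSD fail to restart.
      leveldb compression = false
      leveldb compact on mount = true

After restarting the gateway, create a fresh bucket and confirm its metadata responds quickly, e.g.:

  radosgw-admin bucket stats --bucket=<new-bucket-name>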