Re: RadosGW performance s3 many objects


 



Hello Guys 

Are several million objects with Ceph (for the RGW use case) still an issue? Or has this been fixed?

Thnx
Vickey

On Thu, Jan 28, 2016 at 12:55 AM, Krzysztof Księżyk <kksiezyk@xxxxxxxxx> wrote:
Stefan Rogge <stefan.ceph@...> writes:

>
> Hi,
> we are using Ceph with the RadosGW and S3 setup.
> With more and more objects in the storage, the writing speed slows down
> significantly. With 5 million objects in the storage we had a writing speed
> of 10MB/s. With 10 million objects in the storage it's only 5MB/s.
> Is this a common issue?
> Is the RadosGW suitable for a large number of objects, or would you
> recommend not using the RadosGW with this amount of objects?
>
> Thank you.
>
> Stefan
>
> I also found a ticket on the ceph tracker for the same issue:
>
> http://tracker.ceph.com/projects/ceph/wiki/Rgw_-_bucket_index_scalability
>

Hi,

I'm struggling with the same issue on Ceph 9.2.0. Unfortunately I wasn't
aware of it, and now the only way to improve things is to create a new bucket
with bucket index sharding (see the sketch just below), or to change the way
our apps store data in buckets. And, of course, copy tons of data :(
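In case it saves someone else the trouble: sharding only applies to buckets
created after it is enabled, which is why the data has to be copied. A minimal
sketch of what I mean, assuming rgw_override_bucket_index_max_shards behaves in
9.2.0 the way it did when it was introduced, and with the client section name
and shard count only as examples:

  # ceph.conf on the radosgw host; restart radosgw afterwards
  [client.radosgw.gateway]
  rgw override bucket index max shards = 64

Buckets created through that gateway afterwards get their index spread over 64
RADOS objects instead of a single one; existing buckets are not touched.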
In my case something also happened to the leveldb files, and now I cannot even
run some radosgw-admin commands such as:

radosgw-admin bucket check -b ....

which causes OSD daemon flapping and process timeout messages in the logs. PGs
containing .rgw.bucket.index can't even be backfilled to another OSD, as the
OSD process dies with messages like:

[...]
> 2016-01-25 15:47:22.700737 7f79fc66d700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f7992c86700' had suicide timed out after 150
> 2016-01-25 15:47:22.702619 7f79fc66d700 -1 common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(const ceph::heartbeat_handle_d*, const char*, time_t)' thread 7f79fc66d700 time 2016-01-25 15:47:22.700751
> common/HeartbeatMap.cc: 81: FAILED assert(0 == "hit suicide timeout")
>
>  ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f7a019f4be5]
>  2: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, long)+0x2d9) [0x7f7a019343b9]
>  3: (ceph::HeartbeatMap::is_healthy()+0xd6) [0x7f7a01934bf6]
>  4: (ceph::HeartbeatMap::check_touch_file()+0x2c) [0x7f7a019353bc]
>  5: (CephContextServiceThread::entry()+0x15b) [0x7f7a01a10dcb]
>  6: (()+0x7df5) [0x7f79ffa8fdf5]
>  7: (clone()+0x6d) [0x7f79fe3381ad]
>
>
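In case it helps anyone else debugging this: one way to see how big a bucket
index has grown, without going through radosgw-admin, is to count the omap keys
on the index object directly with plain rados commands. A sketch, assuming the
default .rgw.buckets.index pool name; .dir.<bucket_id> stands for the index
object of the affected bucket:

  rados -p .rgw.buckets.index ls
  rados -p .rgw.buckets.index listomapkeys .dir.<bucket_id> | wc -l

For a huge index the listing itself can take a long time, so I would be careful
running it against an already struggling OSD.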
I don't know; maybe it's because of the number of leveldb files in the omap
folder (5.1GB in total). I read somewhere that things can be improved by setting
'leveldb_compression' to false and 'leveldb_compact_on_mount' to true, but I
don't know whether these options have any effect in 9.2.0, as they are not
documented for this release. I tried 'leveldb_compression' without any visible
effect, and I wasn't brave enough to try 'leveldb_compact_on_mount' on the
production environment. Setting it to true on my 0.94.5 test cluster makes the
OSD fail on restart.
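For reference, this is how I would expect those two options to be set if they
are still honoured in 9.2.0; a sketch only, as I have not verified the option
names against this release:

  [osd]
  leveldb compression = false
  leveldb compact on mount = true

By its name, the compaction option only does anything when the OSD (re)mounts
its store, i.e. across a restart of each OSD.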

Kind regards -
Krzysztof Księżyk


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

