Some questions of radosgw

On Fri, Aug 1, 2014 at 9:49 AM, Osier Yang <agedosier at gmail.com> wrote:
> [ correct the URL ]
>
>
> On 2014-08-02 00:42, Osier Yang wrote:
>>
>> Hi, list,
>>
>> I have spent the last several days setting up radosgw in a testing
>> environment to see whether it is stable and mature enough for
>> production use. Meanwhile, I have been reading the radosgw source code
>> to understand how it actually manages the underlying storage.
>>
>> The test results show that write performance to a bucket is not good.
>> As far as I understand from the code, this is because there is only
>> *one* bucket index object for a single bucket, which is not nice in
>> principle. Moreover, requests to the whole bucket could be blocked if
>> the corresponding bucket index object happens to be recovering or
>> backfilling. This is not acceptable for production use. I saw that
>> Guang Yang did some work (the prototype patches [1]) to try to resolve
>> the problem with bucket index sharding, but I'm not confident that it
>> solves the problem at the root: radosgw appears to manage millions or
>> billions of objects in one bucket through the index, and I'm a bit
>> worried about that even if index sharding is supported.
>>
>> Another problem I encountered: when I upgraded radosgw to the latest
>> version (Firefly), radosgw-admin works well and read requests work
>> well too, but all write requests fail. Note that I didn't change the
>> config files, so this suggests a compatibility problem (a client on
>> the new version fails to talk to a Ceph cluster on the old version).
>> The errors look like:
>>
>> 2014-07-31 10:13:10.045921 7fdb40ddd700 0 ERROR: can't read user header:
>> ret=-95
>> 2014-07-31 10:13:10.045930 7fdb40ddd700 0 ERROR: sync_user() failed,
>> user=osier ret=-95
>> 2014-07-31 17:00:56.075066 7fe514fe6780 0 ceph version 0.80.5
>> (38b73c67d375a2552d8ed67843c8a65c2c0feba6), process radosgw, pid 19974
>> 2014-07-31 17:00:56.197659 7fe514fe6780 0 framework: fastcgi
>> 2014-07-31 17:00:56.197666 7fe514fe6780 0 starting handler: fastcgi
>> 2014-07-31 17:00:56.198941 7fe4f8ff9700 0 ERROR: FCGX_Accept_r returned -9
>> 2014-07-31 17:00:56.211176 7fe4f9ffb700 0 ERROR: can't read user header:
>> ret=-95
>> 2014-07-31 17:00:56.211197 7fe4f9ffb700 0 ERROR: sync_user() failed,
>> user=Bob Dylon ret=-95
>> 2014-07-31 17:00:56.212306 7fe4f9ffb700 0 ERROR: can't read user header:
>> ret=-95
>> 2014-07-31 17:00:56.212325 7fe4f9ffb700 0 ERROR: sync_user() failed,
>> user=osier ret=-95

Did you upgrade the OSDs? Did you restart them after the upgrade?
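
For reference, the ret=-95 in the log above is a negated errno value. On
Linux, errno 95 is EOPNOTSUPP ("Operation not supported"), which would be
consistent with the gateway sending a request that the OSDs do not know
how to handle, though the log alone does not prove that. A minimal Python
sketch of decoding it, assuming Linux errno numbering (the small table is
hard-coded for illustration, and decode_ret is a hypothetical helper, not
part of Ceph):

```python
# Linux errno values (from asm-generic/errno*.h), hard-coded so the
# example does not depend on the platform it runs on.
LINUX_ERRNO = {
    2: ("ENOENT", "No such file or directory"),
    5: ("EIO", "Input/output error"),
    95: ("EOPNOTSUPP", "Operation not supported"),
}

def decode_ret(ret):
    """Turn a negative return code such as -95 into a readable string."""
    if ret >= 0:
        return "ret=%d (not an error code)" % ret
    name, text = LINUX_ERRNO.get(-ret, ("UNKNOWN", "not in this table"))
    return "ret=%d -> %s (%s)" % (ret, name, text)

print(decode_ret(-95))
```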

>>
>> After these two experiences, I started to wonder whether radosgw is
>> stable and mature enough yet. It seems that DreamHost is the only one
>> using radosgw for a service, though from what I found on Google there
>> seem to be use cases in private environments. I have no way to show
>> whether it is stable and mature enough for production use other than
>> trying to understand how it works. However, I guess everybody knows
>> how hard it is to go back once a distributed system is already in
>> production. So I'm asking here for advice, thoughts, and suggestions
>> from anyone who has already set up radosgw for production use.
>>
>> In case the mail is too long or boring, I'm summarizing my questions
>> here:
>>
>> 1) Is radosgw stable/mature enough for production use?

We consider it stable and mature for production use.

>>
>> 2) How does it perform (especially on writes) in practice?

Different use cases and patterns have different performance
characteristics. As you mentioned, objects going to the same bucket
will contend on the bucket index. In the future we will be able to
shard that and it will mitigate the problem a bit. Other ideas are to
drop the bucket index altogether for use cases where object listing is
not really needed.
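
To illustrate the sharding idea, here is a minimal Python sketch that maps
each object name to one of N index shards by hashing, so that writes to
different objects in one bucket mostly touch different index objects.
Everything in it is an assumption for illustration: the shard count, the
use of CRC32, and the "index.<bucket>.<shard>" naming are hypothetical and
not necessarily what radosgw does or will do.

```python
import zlib
from collections import Counter

def shard_for(object_name, num_shards):
    """Pick an index shard for an object name with a stable hash."""
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(object_name.encode("utf-8")) % num_shards

def index_object_name(bucket_id, object_name, num_shards):
    """Hypothetical name of the index object that would hold this entry."""
    return "index.%s.%d" % (bucket_id, shard_for(object_name, num_shards))

if __name__ == "__main__":
    # Count how many of 10000 made-up object names land on each of 8 shards.
    names = ["photo-%05d.jpg" % i for i in range(10000)]
    spread = Counter(shard_for(n, 8) for n in names)
    for shard in sorted(spread):
        print(shard, spread[shard])
```

One trade-off of hashing like this is that an ordered listing of the
bucket has to merge entries from every shard, which is part of why
dropping the index can be attractive when listing is not needed.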

>>
>> 3) Could any problems be caused by addressing millions or billions
>> of objects with index objects (even if sharding is supported)?
>>
>> 4) As far as I understand, it's better not to enable the cache in a
>> deployment with multiple radosgw instances, but is there any other way
>> to work around this?

I'm not sure what you're referring to.


Yehuda

>>
>> 5) Are there any other potential traps?
>>
>> Much appreciated in advance.
>>
>> [1] http://news.gmane.org/gmane.comp.file-systems.ceph.devel
>
>
> Never mind, it's
> http://article.gmane.org/gmane.comp.file-systems.ceph.devel/20428
>
>
> Regards,
> Osier
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

