Re: Designing an application with Ceph

Nulik Nol <nuliknol@xxxxxxxxx> · Thu, 15 Aug 2013 13:36:26 -0500

Thanks, I didn't know about omap, but it is a good idea. I also found
that Eleanor Cawthon made a tree balancing project over OSDs. After
analyzing a bit more, I found that some librados and omap functions
aren't asynchronous. This is a considerable disadvantage when writing
a service where you expect high load: with synchronous calls I would
keep a lot of client connections in a wait queue. So I think the best
solution in my case would be to write key/values in my own format over
the OSDs, storing the data in chunks (Ceph objects) of, say, 64KB, like
'pages' in a traditional DB engine. This way, I expect it to be faster
than the omap implementation, and it will also work with asynchronous
calls.
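
To make the page idea concrete, here is a rough sketch of what one
64KB page could look like in C. Everything in it is an assumption made
for illustration: the record layout (two 16-bit lengths followed by the
key and value bytes, in host byte order), the names kv_page, kv_page_put,
kv_page_get and kv_page_object_name, and the use of an FNV-1a hash to
pick which page (Ceph object) a key belongs to. It makes no librados
calls; it only builds and searches the in-memory buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define KV_PAGE_SIZE (64 * 1024)   /* one page = one Ceph object (assumed) */

/* Hypothetical page layout: small header, then packed records. */
struct kv_page {
    uint32_t used;      /* bytes of data[] already filled */
    uint32_t nrecords;  /* number of records stored */
    uint8_t  data[KV_PAGE_SIZE - 2 * sizeof(uint32_t)];
};
_Static_assert(sizeof(struct kv_page) == KV_PAGE_SIZE, "page must be 64KB");

static void kv_page_init(struct kv_page *p)
{
    memset(p, 0, sizeof *p);
}

/* Append one record: [u16 klen][u16 vlen][key][value].
 * Returns 0 on success, -1 if the page has no room left. */
static int kv_page_put(struct kv_page *p, const void *key, uint16_t klen,
                       const void *val, uint16_t vlen)
{
    size_t need = 2 * sizeof(uint16_t) + klen + vlen;

    assert(p->used <= sizeof p->data);
    if (need > sizeof p->data - p->used)
        return -1;

    uint8_t *w = p->data + p->used;
    memcpy(w, &klen, sizeof klen); w += sizeof klen;
    memcpy(w, &vlen, sizeof vlen); w += sizeof vlen;
    memcpy(w, key, klen);          w += klen;
    memcpy(w, val, vlen);

    p->used += (uint32_t)need;
    p->nrecords++;
    return 0;
}

/* Linear scan of the page. Returns a pointer to the value inside the
 * page and stores its length in *vlen, or NULL if the key is absent. */
static const uint8_t *kv_page_get(const struct kv_page *p, const void *key,
                                  uint16_t klen, uint16_t *vlen)
{
    const uint8_t *r = p->data;

    for (uint32_t i = 0; i < p->nrecords; i++) {
        uint16_t kl, vl;
        memcpy(&kl, r, sizeof kl);
        memcpy(&vl, r + sizeof kl, sizeof vl);
        r += 2 * sizeof(uint16_t);
        if (kl == klen && memcmp(r, key, klen) == 0) {
            *vlen = vl;
            return r + kl;
        }
        r += (size_t)kl + vl;
    }
    return NULL;
}

/* 32-bit FNV-1a hash, used here only to spread keys over pages. */
static uint32_t kv_hash(const void *key, size_t len)
{
    const uint8_t *b = key;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= b[i];
        h *= 16777619u;
    }
    return h;
}

/* Build the (hypothetical) object name of the page that holds a key. */
static void kv_page_object_name(char *out, size_t outlen, const void *key,
                                size_t klen, uint32_t npages)
{
    snprintf(out, outlen, "kvpage.%08x",
             (unsigned)(kv_hash(key, klen) % npages));
}
```

The filled buffer would then be handed to an asynchronous librados
write (rados_aio_write in the C API, for instance), with the name from
kv_page_object_name as the object id. Deletes, updates in place, and
what to do when a page fills up are left out of this sketch.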

On Tue, Aug 13, 2013 at 6:09 PM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
> 2 is certainly an intriguing option.  RADOS isn't really a database
> engine (even a nosql one), but should be able to serve your needs
> here.  Have you seen the omap api available in librados?  It allows
> you to efficiently store key/value pairs attached to a librados object
> (uses leveldb on the OSDs to actually handle the key/value mapping).
>
> One caveat is that the C api is somewhat less complete than the C++
> api.  That would be pretty easily remedied if there were demand
> though.
> -Sam
>
> On Tue, Aug 13, 2013 at 2:01 PM, Nulik Nol <nuliknol@xxxxxxxxx> wrote:
>> Hi,
>> I am planning to use Ceph as database storage for a webmail
>> client/server application, and I am thinking of storing the data as
>> key/value pairs instead of using an RDBMS, for speed. The webmail
>> will manage companies, and each company will have many users. Users
>> will send/receive emails and store them in their inboxes, kind of like
>> Gmail, but per company. The server will be developed in C, client code
>> in HTML/Javascript and binary client (standalone app) in C++.
>> So, my question is, how would you recommend I design the backend?
>>
>> I have thought of these choices:
>>
>> 1. Use Ceph as filesystem and BerkeleyDB as the database engine.
>> Berkeley DB uses 2 files per table, so I will have 1 directory per
>> company and 2 files per table. I think there will be no more
>> than 20 tables in my whole app. Ceph will be used here as a remote
>> filesystem where BerkeleyDB will do all the data organization. The
>> RADOS interface of Ceph (to store key/value pairs) will not be used,
>> since Berkeley DB will write and read to the OSDs directly and
>> Berkeley DB is a key/value pair database. But I have never used a DB
>> on a remote filesystem, so I am not sure if it will work well. Advantages of
>> this architecture: quick & easy.
>> Disadvantages: lower performance (overhead in CephFS and BerkeleyDB),
>> also I will not be able to write plugins for RADOS in C++ to combine
>> many data modifications in a single call to the server.
>>
>> 2. Use librados C api and write all the 'queries' hardcoded in C
>> specifically for the application. Since the application is pretty
>> standard and is not supposed to change much, I can do this. I would
>> create a RADOS object for each application object (like for example
>> 'user' record, 'email' record, 'chat message' record, etc...).
>> Advantages: high performance. Disadvantages: a bit more to code,
>> especially the data search functions.
>>
>> I am interested in performance, so I am thinking of going for option
>> 2. What do you think? Can RADOS fully replace a database engine? (I
>> mean a NoSQL engine, like Berkeley DB for example.)
>>
>> Will appreciate very much your comments.
>> TIA
>> Nulik
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com