On Mon, May 26, 2014 at 5:14 PM, Christian Balzer <chibi at gol.com> wrote:
>
> Hello,
>
> On Mon, 26 May 2014 10:28:12 +0200 Carsten Aulbert wrote:
>
>> Hi all
>>
>> first off, we have yet to start with Ceph (and other clustered file
>> systems other than QFS), therefore please consider me a total newbie
>> w.r.t. Ceph.
>>
> Firstly the usual heads up, CephFS is not considered ready for production
> due to the non-HA nature of MDS.
>

This is incorrect: CephFS HA is achieved by means of active-standby MDS
daemons. If the active MDS fails, a standby takes over (a quick
illustration is at the bottom of this mail).

>> We are trying to solve disk I/O problems we face and would like to
>> explore whether we could utilize our currently underused network more in
>> exchange for more disk performance. We do have a couple of machines at
>> hand to use, but I would like to learn how well Ceph scales with a
>> large number of systems/disks.
>>
> It scales rather well; if you search the archives here or the net in
> general you will find a slide show from your colleagues at CERN.
>
>> In a safe approach, we could use 16 big boxes with 12 3 TB disks inside
>> and explore using JBOD, hardware or software RAID, and 10 Gb/s Ethernet
>> uplinks.
>>
> You're not really telling us what your I/O problems are and what your
> goals are; scientific storage needs frequently tend to be huge amounts of
> data (sequential). OTOH you might want/need more IOPS.
>
>> On the other hand, we could scale out to about 1500-2500 machines, each
>> with local disks (500 GB-1 TB) and/or SSDs (60 GB) inside.
>>
> This will, in theory, give you a very high performance cluster, given
> enough bandwidth between all those nodes.
> However I'd be very afraid of the administrative nightmare this would be;
> you'd need a group of people just healing/replacing nodes. ^o^
>
>> For now I have two questions concerning this:
>>
>> (a) Would either approach work with O(2000) clients?
>>
> Really depends on what these clients are doing. See above.
>
>> (b) Would Ceph scale well enough to have O(2000) disks in the background,
>> each connected with 1 Gb/s to the network?
>>
> I'd guess yes, but see above for why this might not be such a great idea.
>
> Somebody from Inktank or one of the big folks is sure to pipe up.
>
> Christian
>
>> Has anyone experience with these numbers of hosts, or do people use
>> "access" nodes in between which export a Ceph file system via NFS or
>> similar systems?
>>
>> Cheers
>>
>> Carsten
>>
>> PS: As a first step, I think I'll go with 4-5 systems just to get a feel
>> for Ceph; scaling out will be a later exercise ;)
> --
> Christian Balzer        Network/Systems Engineer
> chibi at gol.com        Global OnLine Japan/Fusion Communications
> http://www.gol.com/
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
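
For anyone new to this, a minimal sketch of what I mean by active-standby.
The daemon names and hostnames below are just examples (not anything from
Carsten's setup), and the exact status output varies by release:

    # ceph.conf: define two MDS daemons; only one becomes active,
    # the other registers itself as a standby
    [mds.a]
        host = mds-host-a
    [mds.b]
        host = mds-host-b

    # after starting both ceph-mds daemons, check the MDS map:
    #   ceph mds stat
    # it should report one MDS up:active and one up:standby; if the
    # active one dies, the standby is promoted automatically.

Note this is about a single *active* MDS with standbys behind it; running
multiple active MDS daemons is a different matter and still not recommended
for production at this point.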