Re: Need advice on Ceph design

I am following your blog which is awesome!

Based on your explanation, this is what I am thinking: I have hardware
and some consumer-grade SSDs in stock, so I will build my cluster using
those and keep the journal and data on the same SSD. After that I will
run some load tests to see how it performs, and later I will replace
the nodes one by one to make it better. I currently have zero experience
with Ceph and don't know what is good and what is bad, so at least this
cluster will give me some idea of where I need to go.
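
For the load tests I was thinking of something like the fio runs below
before putting the disks into the cluster. The device path /dev/sdX and
the runtimes are just placeholders on my side, and the 4k sync-write run
is only meant to mimic a journal/WAL-style workload, so please correct
me if this is not a useful test:

  # 4k synchronous writes to the raw device, similar to journal/WAL traffic
  # (destructive: run it only on a disk that holds no data yet)
  fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=60 --time_based

  # 4M sequential writes to see the raw throughput of the same device
  fio --name=seq-test --filename=/dev/sdX --direct=1 \
      --rw=write --bs=4M --iodepth=16 --numjobs=1 --runtime=60 --time_based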


I am planning to create an SSD pool and an HDD pool to keep them
separate, as you also mentioned.
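
If I understood the device-class part correctly, I guess it would look
something like the commands below (Luminous-style device classes; the
pool names and PG counts are placeholders I made up for my 5-node setup,
so please correct me if this is wrong):

  # replicated CRUSH rules that only select OSDs of one device class
  ceph osd crush rule create-replicated ssd-rule default host ssd
  ceph osd crush rule create-replicated hdd-rule default host hdd

  # one pool on top of each rule (PG counts are placeholders)
  ceph osd pool create ssd-pool 128 128 replicated ssd-rule
  ceph osd pool create hdd-pool 128 128 replicated hdd-rule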

I have 64 GB of memory, so I think that's enough for an OSD node.

I am avoiding EC because I need performance; my workload is VMs.

I am using the OpenStack-Ansible deployment tool, which has ceph-ansible integrated.

* Do you have any good or ideal configuration which I should use or
take as an example for a 5-node cluster?

* What WAL/DB size should I use? Do you have any recommendation?
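
For reference, this is what I assume the relevant BlueStore settings
look like in ceph.conf before the OSDs get created. The 30 GB figure is
just an example value I picked, not a recommendation I found anywhere;
I have also seen a rough rule of thumb of sizing block.db at a few
percent of the data device, but please correct me if that is off:

  [osd]
  # size (in bytes) of the block.db partition created for new OSDs;
  # ~30 GB here purely as an example value
  bluestore_block_db_size = 32212254720
  # the WAL normally lives inside the DB partition, so I assume it
  # rarely needs its own size
  # bluestore_block_wal_size = 1073741824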



On Thu, Jul 19, 2018 at 3:16 AM, Sébastien VIGNERON
<sebastien.vigneron@xxxxxxxxx> wrote:
> Hi,
>
> First, I'm no expert, just have some experience with ceph.
>
> At work we did some benchmarks comparing Bluestore (meta + data together) with Filestore (data on HDD, journal on SSD); no big difference.
> We chose to have more disks and a less complex configuration. I have not tested Bluestore with separate disks for metadata and data.
>
> You can use an SSD for WAL/RocksDB, but the infrastructure will be more complex to maintain.
> If that SSD fails, you will lose the metadata for many OSDs. Depending on your pool configuration, you may break your cluster.
> If you keep WAL + RocksDB + data on the same disk with Bluestore, the impact is not the same.
>
> The replication/EC choice will impact the performance and the total space available in your cluster.
> The recommended minimum size for a replicated pool is 3, so you can lose up to 2 OSDs in a pool without losing your data.
> So with size=3, your total raw space gets divided by 3. With 5 * 6 * 500 GB = 15 TB raw, you can have at most 5 TB available (before formatting). And you can lose 2 nodes without losing data (if your CRUSH map is correct).
> With EC, it's a bit different. Depending on your (k,m) pair, you will save some space but lower your performance.
>
> Also, mixing SSD and HDD disks for data in the same pool is not a good idea. They have different throughput, access times, etc.; at most you will get the performance of your slowest device (the HDD).
>
> You can have 2 types of pools: one with high perf (SSD only) and one with moderate perf (HDD only). You will need to edit your CRUSH map and define an ssd device class.
>
> For your OSD nodes, I recommend at least 6 GB of RAM (1 GB of RAM per TB plus some for the OS).
>
> For the cluster administration, look at ceph-ansible.
>
> You should benchmark your disks and your pools once they are created to see what works best for you.
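
For the pool-level tests I was planning to use rados bench once the
pools exist, something along these lines (the pool name is just my
placeholder from above):

  rados bench -p ssd-pool 60 write --no-cleanup
  rados bench -p ssd-pool 60 seq
  rados bench -p ssd-pool 60 rand
  rados -p ssd-pool cleanup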
>
> If anybody else has something to add, do not hesitate. ;-)
>
>
>> On 18 Jul 2018, at 21:26, Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>
>> Thanks Sebastien,
>>
>> Let me answer all of your questions which I missed. Let me tell you,
>> this is my first cluster, so I have no idea what would be best or worst
>> here. Also, you said we don't need an SSD journal for BlueStore, but I
>> have heard people say that WAL/RocksDB requires an SSD; can you explain?
>>
>> If I have SATA 500GB 7.5k HDDs, will running the WAL/RocksDB journal on
>> the same OSD disk slow things down?
>>
>>
>>
>>
>> On Wed, Jul 18, 2018 at 2:42 PM, Sébastien VIGNERON
>> <sebastien.vigneron@xxxxxxxxx> wrote:
>>> Hello,
>>>
>>> What is your expected workload? VMs, primary storage, backup, objects storage, ...?
>>
>> All VMs only (we are running OpenStack and I need an HA solution,
>> live migration, etc.)
>>
>>> How many disks do you plan to put in each OSD node?
>>
>> 6 disks per OSD node (I have Samsung 850 EVO Pro 500GB & SATA 500GB 7.5k)
>>
>>> How many CPU cores? How much RAM per node?
>>
>> 2.9 GHz (32 cores in /proc/cpuinfo)
>>
>>> Ceph access protocol(s): CephFS, RBD or objects?
>>
>> RBD only
>>
>>> How do you plan to give access to the storage to your clients? NFS, SMB, CephFS, ...?
>>
>> Openstack Nova / Cinder
>>
>>> Replicated pools or EC pools? If EC, k and m factors?
>>
>> I hadn't thought about it. This is my first cluster, so I don't know what would be best.
>>
>>> What OS (for ceph nodes and clients)?
>>
>> CentOS 7.5 (Linux)
>>
>>>
>>> Recommendations:
>>> - For your information, Bluestore is not like Filestore: there is no need for a journal SSD. For Bluestore it is recommended to use the same disk for both WAL/RocksDB and data.
>>> - For production, it's recommended to have dedicated MON/MGR nodes.
>>> - You may also need dedicated MDS nodes, depending on the Ceph access protocol(s) you choose.
>>> - If you need commercial support afterwards, you should talk to a Red Hat representative.
>>>
>>> The Samsung 850 Pro is consumer grade, not great.
>>>
>>>
>>>> On 18 Jul 2018, at 19:16, Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>>>
>>>> I have decided to set up a 5-node Ceph storage cluster, and the following
>>>> is my inventory; just tell me if it is good for starting a first cluster
>>>> for an average load.
>>>>
>>>> 0. Ceph Bluestore
>>>> 1. Journal SSD (Intel DC 3700)
>>>> 2. OSD disk Samsung 850 Pro 500GB
>>>> 3. OSD disk SATA 500GB (7.5k RPM)
>>>> 4. 2x10G NICs (separate public/cluster networks with jumbo frames)
>>>>
>>>> Do you think this combination is good for an average load?
>>>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



