I am following your blog, which is awesome! Based on your explanation, this is what I am thinking: I have hardware and some consumer-grade SSDs in stock, so I will build my cluster with those and keep the journal and data on the same SSD. After that I will run some load tests to see how it performs, and later I will replace the nodes one by one to make it better. I currently have zero experience with Ceph and don't know what is good and what is bad, so at least this cluster will give me some idea of where I need to go.

I am planning to create an SSD pool and an HDD pool and keep the two separate, as you also mentioned. I have 64 GB of memory, so I think that is enough for an OSD node. I am avoiding EC because I need performance; my workload is VMs. I am using the OpenStack-Ansible deployment tool, which has ceph-ansible integrated.

* Do you have any good or ideal configuration which I should use, or take as an example, for a 5-node cluster?
* What WAL/DB journal size should I use? Any recommendation?
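For the two separate pools, here is roughly what I have in mind, based on the device-class suggestion in your reply below. This is only a sketch: the pool names and PG counts are placeholders I picked, and I am assuming Luminous or later, where device classes are detected automatically.

    # check the detected device classes (ssd / hdd) per OSD
    ceph osd tree

    # one replicated CRUSH rule per device class
    ceph osd crush rule create-replicated rule-ssd default host ssd
    ceph osd crush rule create-replicated rule-hdd default host hdd

    # one pool per rule (PG counts are placeholders)
    ceph osd pool create ssd-pool 128 128 replicated rule-ssd
    ceph osd pool create hdd-pool 128 128 replicated rule-hdd

    # replicated size 3, as recommended below
    ceph osd pool set ssd-pool size 3
    ceph osd pool set hdd-pool size 3

    # mark both pools for RBD, since the workload is Nova/Cinder volumes
    ceph osd pool application enable ssd-pool rbd
    ceph osd pool application enable hdd-pool rbd

Does that look like the right direction?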
On Thu, Jul 19, 2018 at 3:16 AM, Sébastien VIGNERON <sebastien.vigneron@xxxxxxxxx> wrote:
> Hi,
>
> First, I'm no expert, I just have some experience with Ceph.
>
> At work we did some benchmarks comparing Bluestore (metadata + data on the same disk) against Filestore with data on HDD and journal on SSD, and there was no big difference. We chose to have more disks and a less complex configuration. I have not tested Bluestore with separate disks for metadata and data.
>
> You can use an SSD for the WAL/RocksDB, but the infrastructure will be more complex to maintain. If that SSD fails, you will lose the metadata for many OSDs; depending on your pool configuration, you may break your cluster. If you keep WAL + RocksDB + data on each disk with Bluestore, the impact is not the same.
>
> The replication/EC choice will affect the performance and the total space available in your cluster. The recommended minimum size for a replicated pool is 3, so you can lose up to 2 OSDs in a pool without losing your data. With size=3, your total raw space gets divided by 3: with 5 * 6 * 500 GB = 15 TB raw, you can have at most 5 TB available (before formatting), and you can lose 2 nodes without losing data (if your CRUSH map is correct). With EC it's a bit different: depending on your (k,m) pair, you will save some space but lower your performance.
>
> Also, mixing SSDs and HDDs for data is not a good idea. They have different throughput, access times, and so on; at best you will get the performance of your slowest device (the HDD).
>
> You can have two types of pools: one with high performance (SSD only) and one with moderate performance (HDD only). You will need to edit your CRUSH map and define an ssd device class.
>
> For your OSD nodes, I recommend at least 6 GB of RAM (1 GB of RAM per TB, plus some for the OS).
>
> For cluster administration, look at ceph-ansible.
>
> You should benchmark your disks and pools once they are created to see what is best for you.
>
> If anybody else has something to add, do not hesitate. ;-)
>
>
>> On Jul 18, 2018, at 21:26, Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>
>> Thanks Sébastien,
>>
>> Let me answer all of your questions which I missed. Let me tell you, this is my first cluster, so I have no idea what would be best or worst here. Also, you said we don't need an SSD journal for Bluestore, but I heard people saying the WAL/RocksDB requires an SSD; can you explain?
>>
>> If I have a SATA 500 GB 7.5k HDD, will running the WAL/RocksDB journal on the same OSD disk slow it down?
>>
>>
>> On Wed, Jul 18, 2018 at 2:42 PM, Sébastien VIGNERON <sebastien.vigneron@xxxxxxxxx> wrote:
>>> Hello,
>>>
>>> What is your expected workload? VMs, primary storage, backup, object storage, ...?
>>
>> All VMs only (we are running OpenStack, and all I need is an HA solution, live migration, etc.)
>>
>>> How many disks do you plan to put in each OSD node?
>>
>> 6 disks per OSD node (I have Samsung 850 EVO Pro 500 GB & SATA 500 GB 7.5k)
>>
>>> How many CPU cores? How much RAM per node?
>>
>> 2.9 GHz (32 cores in /proc/cpuinfo)
>>
>>> Ceph access protocol(s): CephFS, RBD or objects?
>>
>> RBD only
>>
>>> How do you plan to give your clients access to the storage? NFS, SMB, CephFS, ...?
>>
>> OpenStack Nova / Cinder
>>
>>> Replicated pools or EC pools? If EC, which k and m factors?
>>
>> I hadn't thought about it. This is my first cluster, so I don't know what would be best.
>>
>>> What OS (for Ceph nodes and clients)?
>>
>> CentOS 7.5 (Linux)
>>
>>> Recommendations:
>>> - For your information, Bluestore is not like Filestore: there is no need for a journal SSD. For Bluestore it is recommended to use the same disk for both the WAL/RocksDB and the data.
>>> - For production, it's recommended to have dedicated MON/MGR nodes.
>>> - You may also need dedicated MDS nodes, depending on the Ceph access protocol(s) you choose.
>>> - If you need commercial support afterward, you should talk to a Red Hat representative.
>>>
>>> The Samsung 850 Pro is consumer grade, not great.
>>>
>>>
>>>> On Jul 18, 2018, at 19:16, Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>>>
>>>> I have decided to set up a 5-node Ceph storage cluster, and the following is my inventory. Just tell me whether it is good enough to start a first cluster for an average load.
>>>>
>>>> 0. Ceph Bluestore
>>>> 1. Journal SSD (Intel DC 3700)
>>>> 2. OSD disk: Samsung 850 Pro 500 GB
>>>> 3. OSD disk: SATA 500 GB (7.5k RPM)
>>>> 4. 2x 10G NICs (separate public/cluster networks with jumbo frames)
>>>>
>>>> Do you think this combination is good for an average load?
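P.S. On the WAL/RocksDB question above, this is how I understand the two Bluestore layouts being discussed. Just a sketch with made-up device names, to check that I have it right:

    # collocated: data + RocksDB + WAL on one device; simpler, and a failed SSD
    # only takes out that single OSD
    ceph-volume lvm create --bluestore --data /dev/sdb

    # separate: data on the HDD, RocksDB (the WAL follows it by default) on an SSD
    # partition; faster metadata, but if that SSD dies, every OSD whose block.db
    # lives on it is lost with it
    ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/sdf1

So for a first cluster I will start with the collocated layout and benchmark from there.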