On Tue, 19 Nov 2024 at 03:15, Christoph Pleger <Christoph.Pleger@xxxxxxxxxxxxxxxxx> wrote:

> Hello,
> Is it possible to have something like RAID0 with Ceph?
> That is, when the cluster configuration file contains
>
> osd pool default size = 4

This means all data is replicated 4 times, in your case one piece per
OSD, which in your case also means one piece per host. You will at most
be able to fit s worth of data into this pool, since it will eat 4 times
that size in raw disk for its 4 copies.

> and I have four hosts with one osd drive (all the same size, let's call
> it size s) per host, is it somehow possible to add four other hosts
> with one osd drive (again with size s) per host, so that the resulting
> Ceph block device is of size 2 * s?

You seem to use the term "ceph block device" in an odd way. The common
use of the term means "an RBD image that a ceph client mounts", and the
size of that will not change if you add more hosts. It will be allowed
to grow if more hosts appear, since the pool can then take larger images,
or more of them, but until you resize it the size stays fixed. If you
mean "the pool on which my images reside" instead, then the answer is
"if you add 4 more hosts to your existing 4, then you can use twice the
amount of storage".

I'm not sure which kind of confusion is at play here, but let me state a
few things about ceph in the hope of making your view of how it works
clearer:

1. The "size" of the pool only controls the number of copies each object
in it will have; it does not control a number of MB/GB/TB. All pools
grow as you put objects into them, until storage runs out (it will stop
before 95%, but still...).

2. The pool has a number of PGs, set at creation but editable later on,
and these also do not control the size of the pool, only how it spreads
over the various OSDs and hosts. One should aim for something like
100-200 PGs per OSD, so in a 4-OSD/4-host case like your example, and
with size=4, the pool should have 128 PGs, which means 128 * 4 (for
size) = 512 PG copies end up on the 4 OSDs, i.e. roughly 128 per OSD.
If/when you add 4 more OSDs, bump the pool to 256 PGs.

3. RAID0 is basically about letting data stretch onto several drives.
This is how ceph (and many other storage clusters) works by default.
There are no settings you have to tune or figure out for it to allow you
to use new disks. You may later want to prevent this, for instance if
you want to run one pool on spindrives and other pools on ssd/nvme; then
you would actively configure it to not use whatever disks are added.

4. If we are talking about RBD images, like the ones used for openstack
or proxmox VMs with ceph block storage, then those are internally split
up into lots and lots of pieces by librbd, so when you ask for, say, a
40G drive for your VM, you are actually getting lots and lots of 4M
(by default) pieces that in total sum up to 40G. Each of these pieces
ends up on a pseudo-randomly chosen OSD, meaning that your thousands of
pieces spread onto the 128 PGs in a mostly very even way. This is sort
of acting a bit like raid0/jbod in some fashion, if you squint your eyes
a bit. The important part is that when your VM reads or writes across
the whole of its 40G block device, it will involve ALL the OSD drives,
which is how you want a storage cluster to work.

--
May the most significant bit of your life be positive.
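
To make points 1, 2 and 4 concrete, here is a minimal sketch of the
standard ceph/rbd CLI commands for the 4-OSD example above. The pool
name "mypool" and image name "vm-disk" are made up for illustration;
the sizes and PG counts just follow the example:

    # Replicated pool with 128 PGs and 4 copies for the 4-OSD/4-host case
    ceph osd pool create mypool 128 128 replicated
    ceph osd pool set mypool size 4
    ceph osd pool application enable mypool rbd

    # A 40G RBD image; librbd splits it into 4M objects spread over the PGs
    rbd create mypool/vm-disk --size 40G

    # After adding 4 more OSDs/hosts, bump the PG count of the pool
    ceph osd pool set mypool pg_num 256
    ceph osd pool set mypool pgp_num 256

    # Check how much raw space the 4 copies actually consume
    ceph df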