> Is an object a CephFS file or a RBD image or is it the 4MB blob on the
> actual OSD FS?

Objects are at the RADOS level. CephFS filesystems, RBD images and RGW
objects are all built by striping their data across RADOS objects - the
default object size is 4MB.

> In my case, I'm only looking at RBD images for KVM volume storage, even
> given the default striping configuration I would assume that those 12500
> OSD objects for a 50GB image would not be in the same PG and thus just on
> 3 (with 3 replicas set) OSDs total?

Objects are striped across placement groups, so take your RBD size divided
by 4MB and cap it at the total number of placement groups in your cluster.

> What amount of disks (OSDs) did you punch in for the following run?
>
>> Disk Modeling Parameters
>>     size:          3TiB
>>     FIT rate:      826 (MTBF = 138.1 years)
>>     NRE rate:      1.0E-16
>> RADOS parameters
>>     auto mark-out: 10 minutes
>>     recovery rate: 50MiB/s (40 seconds/drive)
>
> Blink???
>
> I guess that goes back to the number of disks, but to restore 2.25GB at
> 50MB/s with 40 seconds per drive...

The surviving replicas for the placement groups that the failed OSD
participated in will naturally be distributed across many OSDs in the
cluster; when the failed OSD is marked out, its replicas will be remapped
to many OSDs. It's not a 1:1 replacement like you might find in a RAID
array.

>>     osd fullness:  75%
>>     declustering:  1100 PG/OSD
>>     NRE model:     fail
>>     object size:   4MB
>>     stripe length: 1100
>
> I take it that is to mean that any RBD volume of sufficient size is indeed
> spread over all disks?

Spread over all placement groups - the difference is subtle, but there is
a difference.

-- 
Kyle
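To make the striping arithmetic above concrete, here is a minimal Python
sketch. The 50GB image and 4MB object size come from the thread; the pool's
PG count (2048) and replica count (3) are assumed example values, not
figures from the thread.

```python
# Minimal sketch of how many RADOS objects back an RBD image, and why the
# image ends up spread across (nearly) all placement groups rather than
# sitting on just 3 OSDs.
# Assumption: pg_count=2048 and replicas=3 are illustrative only.

GiB = 1024 ** 3
MiB = 1024 ** 2

image_size  = 50 * GiB    # RBD image size from the thread
object_size = 4 * MiB     # default RADOS object size for RBD
pg_count    = 2048        # assumed pool pg_num (not from the thread)
replicas    = 3           # assumed pool size / replica count

rados_objects = image_size // object_size     # 12800 (the thread's ~12500 uses decimal GB/MB)
pgs_touched   = min(rados_objects, pg_count)  # capped at the pool's placement group count

print(f"RADOS objects in the image: {rados_objects}")
print(f"Placement groups touched:   ~{pgs_touched}")
print(f"Replica placements:         ~{pgs_touched * replicas}, spread over many OSDs")
```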
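A similar back-of-the-envelope sketch for the "40 seconds/drive" recovery
figure, using only the modeling parameters quoted above (3TiB drives, 75%
full, 1100 PG/OSD declustering, 50MiB/s recovery rate). The simulator's
exact model may differ, so treat this as an approximation of why parallel,
declustered recovery is so much faster than a 1:1 RAID rebuild.

```python
# Rough sketch of why the model reports ~40 seconds of recovery work per
# drive: the failed OSD's data is re-replicated by its ~1100 peers in
# parallel, so each peer only handles a small slice.

TiB = 1024 ** 4
MiB = 1024 ** 2

drive_size    = 3 * TiB     # disk size from the quoted parameters
fullness      = 0.75        # osd fullness
declustering  = 1100        # PG/OSD, roughly the number of recovery peers
recovery_rate = 50 * MiB    # per-OSD recovery rate, bytes/s

data_to_recover  = drive_size * fullness            # ~2.25 TiB
per_peer_data    = data_to_recover / declustering   # ~2.1 GiB per peer
per_peer_seconds = per_peer_data / recovery_rate    # ~43 s, i.e. "40 seconds/drive"

print(f"Data to re-replicate: {data_to_recover / TiB:.2f} TiB")
print(f"Per-peer share:       {per_peer_data / MiB:.0f} MiB")
print(f"Per-peer time:        {per_peer_seconds:.0f} s")
```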