Re: Reliability model for RADOS - effects during second failures

Loic Dachary <loic@xxxxxxxxxxx> · Thu, 03 Jul 2014 09:10:37 +0200

Hi koleosfuscus,

On 03/07/2014 00:33, Koleos Fuscus wrote:
> Hi Kyle, Loic,
> 
> The current code uses a “FIT rate multiplier” to account for, for
> instance, the effect of operations done in parallel. That multiplier (n)
> has an effect on Pfail. For the initial failure, it is calculated using
> the number of replicas and the stripe count, as seen in
> https://github.com/ceph/ceph-tools/blob/master/models/reliability/RadosRely.py#L86.
> 
> The thing that doesn’t make sense to me is the way the multiplier is
> calculated for the failure of the remaining copies in
> https://github.com/ceph/ceph-tools/blob/master/models/reliability/RadosRely.py#L92
> Why are the stripes not taken into account? What is the purpose of
> using the “declustering factor” in that equation? Is that equation
> correct? I read this note by Sage
> https://www.mail-archive.com/ceph-devel@xxxxxxxxxxxxxxx/msg01650.html
> trying to clarify the role of PGs, but it didn’t help me understand it.
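
For reference, here is a minimal sketch of how a FIT rate multiplier can enter a failure probability. It assumes failures follow a Poisson process and that a FIT is one failure per 10^9 device-hours. The helper name `p_fail` and all numbers are illustrative assumptions; I have not checked this against what RadosRely.py actually computes.

```python
import math

HOURS_PER_BILLION = 1e9  # FIT = failures per 10^9 device-hours

def p_fail(fits, hours, multiplier=1.0):
    """Probability of at least one failure within `hours`.

    Hypothetical helper: assumes a Poisson process whose rate is the
    FIT rate scaled by `multiplier` (e.g. for operations done in parallel).
    """
    expected_failures = multiplier * fits * hours / HOURS_PER_BILLION
    return 1.0 - math.exp(-expected_failures)

# Illustrative numbers only: 5000 FITs over one year,
# without a multiplier and with a multiplier of 4.
single = p_fail(5000, 24 * 365)
scaled = p_fail(5000, 24 * 365, multiplier=4)
```

Under these assumptions the multiplier only scales the expected number of failures in the window, so any question about stripes or declustering becomes a question about what n should be for each failure.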

At the risk of adding confusion to the discussion, does the current reliability model make room for what is described in anrg.usc.edu/~maheswaran/Xorbas.pdf under "4. Reliability Analysis"? In other words, is there a place where one could set things like "disks fail P% of the time", "the network is X Gb/s" and "repairing a disk failure requires reading B bytes from M disks"? As far as I understand, such factors cannot be expressed with a single formula, and this is why a Markov model is useful.
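
To make the Markov idea concrete, here is a rough sketch of a birth-death chain in which state i is the number of failed disks and the state past the tolerated count means data loss. It is only an illustration under assumptions of mine: disk failures are independent with a constant rate, repairs are network-bound and handled one at a time, and `repair_bytes` is the total number of bytes that must cross the network per repair. The function and parameter names are hypothetical and come from neither the paper nor RadosRely.py.

```python
def mttdl_hours(n_disks, tolerated, disk_fail_rate, net_bytes_per_s, repair_bytes):
    """Mean time to data loss (hours) for a birth-death Markov chain.

    State i = number of failed disks; state tolerated + 1 is data loss.
    disk_fail_rate is failures per disk per hour (assumed constant).
    """
    # Assumption: repair speed is bounded only by network bandwidth.
    repair_rate = 3600.0 * net_bytes_per_s / repair_bytes  # repairs per hour

    total = 0.0
    step = 0.0  # expected hours to move from state i to state i + 1
    for i in range(tolerated + 1):
        fail = (n_disks - i) * disk_fail_rate
        repair = repair_rate if i > 0 else 0.0
        # First-passage recursion: h_i = 1/fail_i + (repair_i/fail_i) * h_{i-1}
        step = 1.0 / fail + (repair / fail) * step
        total += step
    return total

# Illustrative numbers only: 14 disks, 4 tolerated failures,
# a 1 Gb/s network and 10 TB read per repair.
example = mttdl_hours(14, 4, 1e-5, 1e9 / 8, 10e12)
```

Each of the knobs mentioned above (failure rate, bandwidth, bytes read per repair) shows up as a separate input here, which is what a single closed-form multiplier makes hard to express.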

> Besides, I have a simple question related to the equation on L86 for
> the initial failure. The striping process splits user content into a
> number of objects, which is equivalent to the stripe count. That group
> of objects constitutes an object set. Each object is composed of one
> or more stripe units. All stripe units (stripe count) are written in
> parallel. Typically each object is mapped to a different disk. What
> happens when the object set is full and a new object is started? Are
> these new objects assigned to the same disks used for the previous full
> object set?

In an ideal situation, if a disk / OSD is full, it means the whole cluster is full. Is it reasonable to ignore this situation when thinking about the reliability model? If not, could you explain how it should be taken into account?

Cheers 
> 
> Best
> 
> koleosfuscus
> 
> ________________________________________________________________
> "My reply is: the software has no known bugs, therefore it has not
> been updated."
> Wietse Venema
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
