Re: Question about reliability model result

dahan <dahanhsi@xxxxxxxxx> · Mon, 31 Aug 2015 17:09:39 +0800

Maybe it's just a precision problem?

I calculated the durability from the PL(*) columns using the formula: 1 - PL(site) - PL(copies) - PL(NRE).

Result:
2-cp is 0.99896562
3-cp is 0.99900049

Both are approximately 99.9%.

Actually the model result is 99.900%. Maybe the author wants us to ignore the last few zeros lol.
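
For anyone who wants to redo the arithmetic, here is a minimal Python sketch of the formula above. It is not taken from the model's source. The inputs are typed in from the printed "1-site" table, so the last digits can differ slightly from numbers computed on unrounded output. I am also assuming that columns shown with a "%" are percentages and that values in bare scientific notation (the 3-cp row) are already fractions.

```python
# Values copied from the "1-site" table in the original post.
# Assumption: "%" values are percentages; bare e-notation values are fractions.
PL_SITE = 0.099950 / 100

ROWS = {
    "1-cp": (0.721457 / 100, 0.000865 / 100),
    "2-cp": (0.000049 / 100, 0.003442 / 100),
    "3-cp": (5.090e-11, 3.541e-09),
}


def durability(pl_site, pl_copies, pl_nre):
    """Durability as 1 minus the sum of the individual loss probabilities."""
    return 1.0 - pl_site - pl_copies - pl_nre


for name, (pl_copies, pl_nre) in ROWS.items():
    d = durability(PL_SITE, pl_copies, pl_nre)
    print(f"{name}: {d:.8f}")
```

The names `PL_SITE`, `ROWS` and `durability` are only illustrative; the model itself may combine the terms differently.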

On Fri, Aug 28, 2015 at 6:26 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
I haven't looked at the internals of the model, but the PL(site)
you've pointed out is definitely the crux of the issue here. In the
first grouping, it's just looking at the probability of data loss due
to failing disks, and as the copies increase that goes down. In the
second grouping, it's including other factors like the entire data
center getting knocked out. That possibility is greater than losing
data due to three disk failures here, so it's capping the total data
durability.
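
The capping effect described above can be illustrated with a rough Python sketch that finds which loss term dominates each row of the second table. This is only an illustration built from the printed values, not the model's own calculation. It assumes "%" columns are percentages and bare scientific-notation values are fractions, and the helper name `dominant_term` is made up for this example.

```python
# Loss probabilities as fractions, copied from the "1-site" table.
# Assumption: "%" values are percentages; bare e-notation values are fractions.
ROWS = {
    "1-cp": {"site": 0.099950 / 100, "copies": 0.721457 / 100, "NRE": 0.000865 / 100},
    "2-cp": {"site": 0.099950 / 100, "copies": 0.000049 / 100, "NRE": 0.003442 / 100},
    "3-cp": {"site": 0.099950 / 100, "copies": 5.090e-11, "NRE": 3.541e-09},
}


def dominant_term(terms):
    """Return (name, share) of the largest contributor to total loss probability."""
    total = sum(terms.values())
    name = max(terms, key=terms.get)
    return name, terms[name] / total


for label, terms in ROWS.items():
    name, share = dominant_term(terms)
    print(f"{label}: dominated by PL({name}), {share:.1%} of total loss probability")
```

With one copy the disk-failure term is the largest; with two or three copies the site term is the largest, so adding copies no longer moves the total much.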

-Greg

On Sat, Aug 22, 2015 at 2:38 AM, dahan <dahanhsi@xxxxxxxxx> wrote:

> Hi,
> I have cross-posted this issue here and on GitHub,
> but have had no response yet.
>
> Any advice?
>

> On Mon, Aug 10, 2015 at 10:21 AM, dahan <dahanhsi@xxxxxxxxx> wrote:

>>
>> Hi all, I have tried the reliability model:
>> https://github.com/ceph/ceph-tools/tree/master/models/reliability
>>
>> I ran the tool with the default configuration and cannot understand the
>> result.
>>

>> ```
>>     storage               durability    PL(site)  PL(copies)     PL(NRE)     PL(rep)    loss/PiB
>>     ----------            ----------  ----------  ----------  ----------  ----------  ----------
>>     Disk: Enterprise         99.119%   0.000e+00   0.721457%   0.159744%   0.000e+00   8.812e+12
>>     RADOS: 1 cp              99.279%   0.000e+00   0.721457%   0.000865%   0.000e+00   5.411e+12
>>     RADOS: 2 cp              7-nines   0.000e+00   0.000049%   0.003442%   0.000e+00   9.704e+06
>>     RADOS: 3 cp             11-nines   0.000e+00   5.090e-11   3.541e-09   0.000e+00   6.655e+02
>> ```

>>

>> ```
>>     storage               durability    PL(site)  PL(copies)     PL(NRE)     PL(rep)    loss/PiB
>>     ----------            ----------  ----------  ----------  ----------  ----------  ----------
>>     Site (1 PB)              99.900%   0.099950%   0.000e+00   0.000e+00   0.000e+00   9.995e+11
>>     RADOS: 1-site, 1-cp      99.179%   0.099950%   0.721457%   0.000865%   0.000e+00   1.010e+12
>>     RADOS: 1-site, 2-cp      99.900%   0.099950%   0.000049%   0.003442%   0.000e+00   9.995e+11
>>     RADOS: 1-site, 3-cp      99.900%   0.099950%   5.090e-11   3.541e-09   0.000e+00   9.995e+11
>> ```

>>

>> The two result tables show different trends. In the first table, the
>> durability ordering is 1 cp < 2 cp < 3 cp. In the second table, however,
>> it is 1 cp < 2 cp = 3 cp.
>>
>> The two tables have the same PL(copies), PL(NRE), and PL(rep).
>> The only difference is PL(site). PL(site) is constant, since the number of
>> sites is constant, so the trend should be the same.
>>
>> How can this result be explained?
>>
>> Is there anything I missed? Thanks

>>

>

>

> _______________________________________________

> ceph-users mailing list

> ceph-users@xxxxxxxxxxxxxx

> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

>
