Hi koleosfuscus,

From http://www.kaymgee.com/Kevin_Greenan/Software_files/hfrs.tar downloaded from http://www.kaymgee.com/Kevin_Greenan/Software.html

In hfrs/models/weaver_8_8_3.disk.ber.model

[num states]
4
0 1 a failure
1 0 b repair
1 2 c failure
2 1 d repair
2 3 e failure
[assign]
a=N*lam_d
b=mu
c=(N-1)*lam_d
d=2*mu
e=(N-2)*lam_d
N=8
lam_d=(1/461386.)
mu=(1/12.)
[END]

is semi-human parsable but hfrs/models/weaver_8_8_3.disk.ber.model

[num states]
5
0 1 a failure
0 4 b failure
1 2 c failure
1 4 d failure
1 0 e repair
2 3 f failure
2 4 g failure
2 1 h repair
3 4 i failure
3 2 j repair
[assign]
a=(N-0)*lam_d*(1-0.000000)*(1-(0.000000*(1-(1-p)**(N-1))))
b=(N-0)*lam_d*(0.000000)+(N-0)*lam_d*(1-0.000000)*((0.000000*(1-(1-p)**(N-1))))
c=(N-1)*lam_d*(1-0.000000)*(1-(0.000000*(1-(1-p)**(N-2))))
d=(N-1)*lam_d*(0.000000)+(N-1)*lam_d*(1-0.000000)*((0.000000*(1-(1-p)**(N-2))))
e=1*mu
f=(N-2)*lam_d*(1-0.000000)*(1-(0.114286*(1-(1-p)**(N-3))))
g=(N-2)*lam_d*(0.000000)+(N-2)*lam_d*(1-0.000000)*((0.114286*(1-(1-p)**(N-3))))
h=2*mu
i=(N-3)*lam_d
j=3*mu
N=8
lam_d=(1/461386.)
mu=(1/12.)
p=0.0237
[END]
[Disk sector conditional fault tolerance]
[[0.0, 0.0, 0.0, 0.0, 0.0043956043956043956, 0.02197802197802198, 0.075924075924075921],
 [0.0, 0.0, 0.0, 0.01098901098901099, 0.057942057942057944, 0.19780219780219779, 1.0],
 [0.0, 0.0, 0.034632034632034632, 0.16623376623376623, 0.49494949494949497, 1.0, 1.0],
 [0.0, 0.11428571428571428, 0.44126984126984126, 0.98333333333333328, 1.0, 1.0, 1.0]]

Kevin writes that "The HFRS uses an extremely efficient mathematical technique, called importance sampling, which enables the observation of extremely low-probability events. I have implemented (and derived in my thesis) efficient simulation algorithms under both exponential and Weibull failure/repairs. The combination of these techniques, in addition to a custom Markov model solver, makes the HFRS an extremely useful tool for evaluating storage system reliability."
meaning you need to understand both https://en.wikipedia.org/wiki/Markov_model and https://en.wikipedia.org/wiki/Importance_sampling as well as the semantics of the input file, which is documented in the README.

Nice find koleosfuscus :-)

Cheers

On 07/07/2014 17:19, Koleos Fuscus wrote:
> Hello Loic,
>
> You asked previously:
> In other words, is there a place where one could set things like "disk
> fail % of the time" and "network is X Gb/s" and "repairing a disk
> failure requires disk require reading B bytes from M disks" ? As far
> as I understand, such factors cannot be expressed with a single
> formula and this is why a Markov model is useful.
>
> I think we need to run simulations to have a more precise estimation
> of the reliability of an erasure coded system. Markov models are not
> as flexible as you may think. Besides, solving the equations when the
> number of components that may fail is large makes the problem
> non-trivial. Maybe standard simulation is enough. As observed by Greenan
> in his thesis, standard simulations have problems with rare events,
> which may not be observed during simulation time. I don't know if we
> should care about rare events for comparing methods.
>
> Greenan released the software used for his thesis. It is completely
> developed in Python.
> http://www.kaymgee.com/Kevin_Greenan/Software.html
>
> I found Greenan's tool while trying to validate the results of ceph-tool
> and the numbers are completely different:
>
> For instance:
>
> Parameters for ceph tool:
> Disk type consumer, FIT1=2167, FIT2=2167
> Size: 2000GiB
> RAID-6
> Replace 0h
> Rebuild 6000MiB/s
> Volumes:8
> NRE model: ignore
> Period: 10 years
>
> (I used these numbers to compare with model 2DFT.disk.model of Greenan's tool)
>
> Parameters for Greenan HFRS tool
> python mm_solve.py -m 2DFT.disk.model -M
>
> Results
>
> CEPH:
>
> storage      durability  PL(site)    PL(copies)  PL(NRE)     PL(rep)     loss/PiB
> -----------  ----------  ----------  ----------  ----------  ----------  ----------
> RAID-6: 6+2  11-nines    0.000e+00   1.318e-12   0.000e+00   0.000e+00   9.887e+02
>
> HFRS:
>
> Analytic MTTDL: 4.06111903031e+12
> *********************
> Analytic prob. of failure: 2.15660e-08
> *********************
>
> Could you check if the parameters for ceph are correct and equivalent
> to the HFRS model? Do you think it makes sense to include Greenan's tool?
> Greenan has a number of models including non-MDS codes. I am not sure
> yet how we can describe the LRC code in this platform but it might be
> possible.
>
> koleosfuscus
>
> ________________________________________________________________
> "My reply is: the software has no known bugs, therefore it has not
> been updated."
> Wietse Venema
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

-- 
Loïc Dachary, Artisan Logiciel Libre
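
For anyone who wants to sanity-check the 4-state model quoted at the top of this message by hand, here is a minimal sketch of a mean-time-to-absorption calculation. It is not taken from HFRS and does not use its parser or solver; the function name mean_time_to_absorption and the choice of the last state (3) as the data-loss state are my assumptions. The rates are copied from the [assign] section, and exact fractions are used so that the very different magnitudes of lam_d and mu do not cause rounding trouble.

```python
from fractions import Fraction

# Parameters copied from the [assign] section of the 4-state model.
N = 8
lam_d = Fraction(1, 461386)  # per-disk failure rate
mu = Fraction(1, 12)         # repair rate

# Transitions copied from the model: (from_state, to_state, rate).
transitions = [
    (0, 1, N * lam_d),        # a, failure
    (1, 0, mu),               # b, repair
    (1, 2, (N - 1) * lam_d),  # c, failure
    (2, 1, 2 * mu),           # d, repair
    (2, 3, (N - 2) * lam_d),  # e, failure
]
NUM_STATES = 4
ABSORBING = 3  # assumption: the last state is the data-loss state


def mean_time_to_absorption(num_states, transitions, absorbing, start=0):
    """Expected time to reach `absorbing` from `start` in a
    continuous-time Markov chain (hypothetical helper, not HFRS code)."""
    transient = [s for s in range(num_states) if s != absorbing]
    index = {s: i for i, s in enumerate(transient)}
    n = len(transient)

    # Augmented system Q_TT * t = -1, where Q_TT is the generator
    # matrix restricted to the transient states.
    rows = [[Fraction(0)] * n + [Fraction(-1)] for _ in range(n)]
    for src, dst, rate in transitions:
        if src == absorbing:
            continue
        i = index[src]
        rows[i][i] -= rate
        if dst != absorbing:
            rows[i][index[dst]] += rate

    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        p = rows[col][col]
        rows[col] = [x / p for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[col])]

    return rows[index[start]][n]


mttdl = mean_time_to_absorption(NUM_STATES, transitions, ABSORBING)
print(f"MTTDL ~ {float(mttdl):.6e}")
```

With these parameters the result is roughly 4.06e+12, in the same time unit as 1/lam_d and 1/mu. That is in the same range as the "Analytic MTTDL" quoted above for 2DFT.disk.model, although I have not checked that the two model files are identical.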