> From: Paul E. McKenney <paulmck@xxxxxxxxxx>
> [...]
> > Thank you so much Paul for the detailed comments on the measured data.
> >
> > I'm curious how did you figure out the number 24 that we at *least* need.
> > This can guide me on whether the number of samples is enough for
> > future testing ;-).
>
> It is a rough rule of thumb.  For more details and accuracy, study up on the
> Student's t-test and related statistical tests.
>
> Of course, this all assumes that the data fits a normal distribution.

Thanks for this extra information. Good to know the Student's t-test.

> > I did another 48 measurements (2x of 24) for each case (w/o and w/
> > Joel's v2 patch) as below.
> > All the testing configurations for the new testing are the same as
> > before.
> >
> > a) Measured 48 times w/o v2 patch (in seconds):
> >    8.4, 8.8, 9.2, 9.0, 8.3, 9.6, 8.8, 9.4,
> >    8.7, 9.2, 8.3, 9.4, 8.4, 9.6, 8.5, 8.8,
> >    8.8, 8.9, 9.3, 9.2, 8.6, 9.7, 9.2, 8.8,
> >    8.7, 9.0, 9.1, 9.5, 8.6, 8.9, 9.1, 8.6,
> >    8.2, 9.1, 8.8, 9.2, 9.1, 8.9, 8.4, 9.0,
> >    9.8, 9.8, 8.7, 8.8, 9.1, 9.5, 9.5, 8.7
> >    The average OS boot time was: ~9.0s
>
> The range is 8.2 through 9.8.
>
> > b) Measured 48 times w/ v2 patch (in seconds):
> >    7.7, 8.6, 8.1, 7.8, 8.2, 8.2, 8.8, 8.2,
> >    9.8, 8.0, 9.2, 8.8, 9.2, 8.5, 8.4, 9.2,
> >    8.5, 8.3, 8.1, 8.3, 8.6, 7.9, 8.3, 8.3,
> >    8.6, 8.9, 8.0, 8.5, 8.4, 8.6, 8.7, 8.0,
> >    8.8, 8.8, 9.1, 7.9, 9.7, 7.9, 8.2, 7.8,
> >    8.1, 8.5, 8.6, 8.4, 9.2, 8.6, 9.6, 8.3
> >    The average OS boot time was: ~8.5s
>
> The range is 7.7 through 9.8.
>
> There is again significant overlap, so it is again unclear that you have a
> statistically significant difference.  So could you please calculate the
> standard deviations?

a's standard deviation is ~0.4.
b's standard deviation is ~0.5.

a's average of 9.0 sits right at the upper bound of b's one-standard-deviation
interval [8.0, 9.0], so the measurements should be statistically significant
to some degree.
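In case anyone wants to reproduce the numbers without an online calculator,
here is a quick Python sketch (just my own check, using the stdlib
statistics module with the sample standard deviation):

```python
import statistics

# Boot times in seconds, without (a) and with (b) Joel's v2 patch.
a = [8.4, 8.8, 9.2, 9.0, 8.3, 9.6, 8.8, 9.4,
     8.7, 9.2, 8.3, 9.4, 8.4, 9.6, 8.5, 8.8,
     8.8, 8.9, 9.3, 9.2, 8.6, 9.7, 9.2, 8.8,
     8.7, 9.0, 9.1, 9.5, 8.6, 8.9, 9.1, 8.6,
     8.2, 9.1, 8.8, 9.2, 9.1, 8.9, 8.4, 9.0,
     9.8, 9.8, 8.7, 8.8, 9.1, 9.5, 9.5, 8.7]
b = [7.7, 8.6, 8.1, 7.8, 8.2, 8.2, 8.8, 8.2,
     9.8, 8.0, 9.2, 8.8, 9.2, 8.5, 8.4, 9.2,
     8.5, 8.3, 8.1, 8.3, 8.6, 7.9, 8.3, 8.3,
     8.6, 8.9, 8.0, 8.5, 8.4, 8.6, 8.7, 8.0,
     8.8, 8.8, 9.1, 7.9, 9.7, 7.9, 8.2, 7.8,
     8.1, 8.5, 8.6, 8.4, 9.2, 8.6, 9.6, 8.3]

mean_a, sd_a = statistics.mean(a), statistics.stdev(a)  # sample stddev
mean_b, sd_b = statistics.mean(b), statistics.stdev(b)
print(f"a: mean={mean_a:.1f} sd={sd_a:.1f}")  # ~9.0 / ~0.4
print(f"b: mean={mean_b:.1f} sd={sd_b:.1f}")  # ~8.5 / ~0.5
```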
The standard deviations were calculated via:
https://www.gigacalculator.com/calculators/standard-deviation-calculator.php

> > @Joel Fernandes (Google), you may replace my old data with the above
> > new data in your commit message.
> >
> > > But we can apply the binomial distribution instead of the usual
> > > normal distribution.  First, let's sort and take the medians:
> > >
> > > a: 8.2 8.3 8.4 8.6 8.7 8.7 8.8 8.8 9.0 9.3  Median: 8.7
> > > b: 7.6 7.8 8.2 8.2 8.2 8.2 8.4 8.5 8.7 9.3  Median: 8.2
> > >
> > > 8/10 of a's data points are greater than 0.1 more than b's median
> > > and 8/10 of b's data points are less than 0.1 less than a's median.
> > > What are the odds that this happens by random chance?
> > >
> > > This is given by sum_{i=0}^{2} (0.5^10 * binomial(10,i)), which is
> > > about 0.055.
> >
> > What's the meaning of 0.5 here? Was it the probability (we assume?)
> > that each time b's data point failed (or didn't satisfy) "less than
> > 0.1 less than a's median"?
>
> The meaning of 0.5 is the probability of a given data point being on one side
> or the other of the corresponding distribution's median.  This of course
> assumes that the median of the measured data matches that of the
> corresponding distribution, though the fact that the median is also a mode of
> both of the old data sets gives some hope.

Thanks for the detailed comments on the meaning of 0.5 here. :-)

> The meaning of the 0.1 is the smallest difference that the data could measure.
> I could have instead chosen 0.0 and asked if there was likely some (perhaps
> tiny) difference, but instead, I chose to ask if there was likely some small
> but meaningful difference.  It is better to choose the desired difference
> before measuring the data.

Thanks for the detailed comments on the meaning of 0.1 here. :-)

> Why don't you try applying this approach to the new data?  You will need the
> general binomial formula.

Thank you Paul for the suggestion.
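For anyone following along, the sum_{i=0}^{2} (0.5^10 * binomial(10,i))
above can be checked with a couple of lines of Python (my own sketch,
using the stdlib math.comb):

```python
from math import comb

# Tail probability for the original 10-sample sign test: at most 2 of
# the 10 points land on the "wrong" side of the median, each side being
# equally likely (p = 0.5) under the null hypothesis of no difference.
n = 10
p_tail = sum(comb(n, i) for i in range(3)) * 0.5 ** n
print(f"P(X <= 2) = {p_tail:.4f}")  # 56/1024 ~= 0.0547
```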
I just tried it, but I'm not sure whether my analysis is correct ...

Analysis 1:

  a's median is 8.9.
  35/48 of b's data points are less than 0.1 less than a's median.
  Under the binomial distribution (n = 48, p = 0.5), P(X >= 35) = ~0.1%.

  So, we have strong confidence that b is at least 100ms faster than a.

Analysis 2:

  a's median - 0.4 = 8.9 - 0.4 = 8.5.
  24/48 of b's data points are less than 0.4 less than a's median.
  The probability that one of a's data points is less than 8.5 is
  p = 7/48 = ~0.1458.
  Under the binomial distribution (n = 48, p = 0.1458), P(X >= 24) = ~0.0%.

  So, it looks like we have confidence that b is at least 400ms faster than a.

The cumulative binomial probabilities P(X) were calculated via:
https://www.gigacalculator.com/calculators/binomial-probability-calculator.php

I apologize if this analysis/discussion bored some of you. ;-)

-Qiuxu

> [...]
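P.S. In case anyone wants to double-check the two tail probabilities above
without the online calculator, here is a small Python sketch using exact
binomial sums (my own helper, not any established library routine):

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Analysis 1: 35 of 48 b samples fall below a's median minus 0.1;
# under the null hypothesis each side of the median is equally likely.
p1 = binom_tail(48, 35, 0.5)

# Analysis 2: 24 of 48 b samples fall below a's median minus 0.4,
# a threshold that only 7 of 48 a samples fall below (p = 7/48).
p2 = binom_tail(48, 24, 7 / 48)

print(f"P(X >= 35 | n=48, p=0.5)  = {p1:.4%}")   # ~0.1%
print(f"P(X >= 24 | n=48, p=7/48) = {p2:.4%}")   # vanishingly small
```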