On Sat 16-04-11 10:33:29, Peter Zijlstra wrote: > On Sat, 2011-04-16 at 00:13 +0200, Jan Kara wrote: > > > > So what is a takeaway from this for me is that scaling the period > > with the dirty limit is not the right thing. If you'd have 4-times more > > memory, your choice of "dirty limit" as the period would be as bad as > > current 4*"dirty limit". What would seem like a better choice of period > > to me would be to have the period in an order of a few seconds worth of > > writeback. That would allow the bdi limit to scale up reasonably fast when > > new bdi starts to be used and still not make it fluctuate that much > > (hopefully). > > No best would be to scale the period with the writeout bandwidth, but > lacking that the dirty limit had to do. Since we're counting pages, and > bandwidth is pages/second we'll end up with a time measure, exactly the > thing you wanted. Yes, I was thinking about this as well. We could measure the throughput but essentially it's a changing entity (dependent on the type of load and possibly other things like network load for NFS, or other machines accessing your NAS). So I'm not sure one constant value will work (esp. because you have to measure it and you never know at which state you did the measurement). And when you have changing values, you have to solve the same problem as with time based periods - that's how I came to them. > > Looking at math in lib/proportions.c, nothing really fundamental requires > > that each period has the same length. So it shouldn't be hard to actually > > create proportions calculator that would have timer triggered periods - > > simply whenever the timer fires, we would declare a new period. The only > > things which would be broken by this are (t represents global counter of > > events): > > a) counting of periods as t/period_len - we would have to maintain global > > period counter but that's trivial > > b) trick that we don't do t=t/2 for each new period but rather use > > period_len/2+(t % (period_len/2)) when calculating fractions - again we > > would have to bite the bullet and divide the global counter when we declare > > new period but again it's not a big deal in our case. > > > > Peter what do you think about this? Do you (or anyone else) think it makes > > sense? > > But if you don't have a fixed sized period, then how do you catch up on > fractions that haven't been updated for several periods? You cannot go > remember all the individual period lengths. OK, I wrote the expressions down and the way I want to do it would get different fractions than your original formula: Your formula is: p(j)=\sum_i x_i(j)/(t_i*2^{i+1}) where $i$ sums from 0 to \infty, x_i(j) is the number of events of type $j$ in period $i$, $t_i$ is the total number of events in period $i$. I want to compute l(j)=\sum_i x_i(j)/2^{i+1} g=\sum_i t_i/2^{i+1} and p(j)=l(j)/g Clearly, all these values can be computed in O(1). Now for t_i = t for every i, the results of both formulas are the same (which is what made me make my mistake). But when t_i differ, the results are different. I'd say that the new formula also provides a meaningful notion of writeback share although it's hard to quantify how far the computations will be in practice... > The whole trick to the proportion stuff is that its all O(1) regardless > of the number of contestants. There isn't a single loop that iterates > over all BDIs or tasks to update their cycle, that wouldn't have scaled. Sure, I understand. Honza -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html