On Fri, Nov 02, 2012 at 12:04:23PM +0800, Zhi Yong Wu wrote:
> HI, guys
>
> VFS hot tracking currently show result as below, and it is very
> strange and not nice.
>
> inode #279, reads 0, writes 1, avg read time 18446744073709551615,
> avg write time 5251566408153596, temp 109
>
> Do anyone know if there is one simpler but effective way to calculate
> data temperature?

Intuitively, data gets hot when it is accessed frequently, and cools down as the frequency of access decreases. It is like heating water: the longer you heat it, the hotter it gets, and the more intense the flame, the faster it heats up. So temperature is a function of both intensity and continuity. In the case of data, intensity maps to the access frequency within a discrete interval, and continuity maps to sustaining that access frequency across further discrete intervals. Any reduction in intensity cools the data down; any break in continuity cools the data down.

We need to define the size of a 'discrete interval', keep collecting the access frequency in each discrete interval, and take a weighted average across all these intervals. That gives us the temperature of the data.

Here is an example to make my thoughts clear. Let's say we define the discrete interval as 1 sec. If a given piece of data is accessed 100 times in this 1 sec, then the temperature of the data at the end of that second becomes (0+100)/2 = 50, where 0 is the temperature at the beginning of the second. If the data is accessed 100 times again in the next second, the temperature at the end of the 2nd sec becomes (50+100)/2 = 75, and so on and so forth. If during any interval the data is not accessed at all, the temperature halves. This way of calculation biases the temperature towards the most recent accesses: a large amount of recent access raises the temperature of the data significantly more than the same amount of non-recent access.
This technique requires us to capture *only* two pieces of information: the access frequency and the latest temperature of the data.

A crude approach is to schedule a kernel daemon approximately every second to update the latest temperature using the access frequency accumulated since the last update. But this approach is not scalable.

The other approach is to decouple the frequency of temperature calculation from the size of the discrete interval. The kernel daemon thread can take its own sweet time to calculate the temperature. Whenever the daemon gets to calculate the new temperature of a given inode, it can spread the load since the last temperature calculation evenly across all the elapsed discrete intervals. Let 'x' be the number of accesses in the last 'y' discrete intervals and 'T' the previous temperature. Applying the per-interval update y times, with x/y accesses in each interval, gives

    T_new = (T + ((2^y - 1)/y) * x) / 2^y
          = (T*y + (2^y - 1)*x) / (y * 2^y)

Of course, I am approximating that there were exactly x/y accesses in each of these y discrete intervals. This way of calculation requires one more field in the data structure: the timestamp of the last temperature update.

RP
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel"
in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html