On Thu, Jun 23, 2011 at 7:25 AM, Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote:
> On Thu, Jun 23, 2011 at 07:13:54AM +0800, Nai Xia wrote:
>> I agree on this point. The dirty bit and the young bit are by no means accurate. Even
>> on 4kB pages, there is always a chance that the pte is dirty but the contents
>> are actually the same. Yeah, the whole optimization contains trade-offs and
>
> Just a side note: the fact that the dirty bit would be set even when the
> data is the same is actually a pro, not a con. If the content is the
> same but the page was written to, it'd trigger a copy on write shortly
> after merging the page, rendering the whole exercise wasteful. The
> cksum plays a double role: it "stabilizes" the unstable tree, so
> there's less chance of bad lookups, and it also keeps us from merging
> stuff that is written to frequently and would trigger copy on writes. The
> dirty bit would also catch overwrites with the same data, something
> the cksum can't do.

Good point. I actually have another version of ksm myself (off topic, but
if you want to take a glance: http://code.google.com/p/uksm/ :-) ) that
collects statistics on the ratio of pages in a VMA that really got COWed
due to KSM merging, on a per-scan-round basis.

It's complicated to deduce precise information from the dirty bit and the
cksum alone.

Thanks,
Nai
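
To make the dirty bit vs. cksum distinction above concrete, here is a minimal user-space sketch in Python. It is only a toy model, not KSM or uksm code: the Page class and the cow_ratio() helper are hypothetical names made up for illustration, and zlib.crc32 merely stands in for whatever checksum the scanner really uses. It shows that an overwrite with identical data sets the dirty bit while leaving the cksum unchanged, and how a per-VMA COW ratio for one scan round could be expressed.

```python
import zlib


class Page:
    """Toy stand-in for a 4kB page plus its pte dirty bit (hypothetical model)."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.dirty = False
        self.last_cksum = zlib.crc32(self.data)

    def write(self, offset, payload):
        # Any store sets the dirty bit, even if the bytes do not change.
        self.data[offset:offset + len(payload)] = payload
        self.dirty = True

    def cksum_changed(self):
        # The checksum only notices a write if the content actually differs.
        return zlib.crc32(self.data) != self.last_cksum

    def end_scan_round(self):
        # Assumed scanner behaviour: record the checksum, clear the dirty bit.
        self.last_cksum = zlib.crc32(self.data)
        self.dirty = False


def cow_ratio(merged, cowed):
    """Fraction of merged pages in one VMA that got COWed in a scan round."""
    return cowed / merged if merged else 0.0


page = Page()

# Overwrite with identical data: dirty bit is set, cksum is unchanged.
page.write(0, b"\x00" * 8)
same_data = (page.dirty, page.cksum_changed())   # (True, False)

# Real modification after the next scan round: both signals fire.
page.end_scan_round()
page.write(0, b"KSM")
new_data = (page.dirty, page.cksum_changed())    # (True, True)
```

In this model a page that is rewritten with the same bytes on every round would look stable to the cksum and still be flagged by the dirty bit, which is the case where merging would be followed by a wasteful copy on write.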