On Thu, Mar 19, 2015 at 3:41 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> My recollection wasn't faulty - I pulled it from an earlier email.
> That said, the original measurement might have been faulty. I ran
> the numbers again on the 3.19 kernel I saved away from the original
> testing. That came up at 235k, which is pretty much the same as
> yesterday's test. The runtime, however, is unchanged from my original
> measurements of 4m54s (pte_hack came in at 5m20s).

Ok. Good. So the "more than an order of magnitude difference" was
really about measurement differences, not quite as real. Looks more
like a "factor of two" than a factor of 20.

Did you do the profiles the same way? Because that would explain the
differences in the TLB flush percentages too (the "1.4% from
tlb_invalidate_range()" vs "pretty much everything from migration").

The runtime variation does show that there's some *big* subtle
difference for the numa balancing in the exact TNF_NO_GROUP details.
It must be *very* unstable for it to make that big of a difference.
But I feel at least a *bit* better about "unstable algorithm changes a
small variation into a factor-of-two" vs that crazy factor-of-20.

Can you try Mel's change to make it use

        if (!(vma->vm_flags & VM_WRITE))

instead of the pte details? Again, on otherwise plain 3.19, just so
that we have a baseline. I'd be *so* much happier with checking the
vma details over per-pte details, especially ones that change over the
lifetime of the pte entry, and that the NUMA code explicitly mucks
with.

                      Linus