On Mon, Jun 16, 2014 at 11:45:42PM -0400, Waiman Long wrote:
> On 06/16/2014 04:59 PM, Kirill A. Shutemov wrote:
> >On Mon, Jun 16, 2014 at 11:49:34PM +0300, Kirill A. Shutemov wrote:
> >>On Mon, Jun 16, 2014 at 03:35:48PM -0400, Waiman Long wrote:
> >>>In the __split_huge_page_map() function, the check for
> >>>page_mapcount(page) is invariant within the for loop. Because
> >>>the macro is implemented using atomic_read(), the redundant
> >>>check cannot be optimized away by the compiler, leading to
> >>>unnecessary reads of the page structure.
> >And atomic_read() is *not* an atomic operation. It's implemented as
> >a dereference through a cast to volatile, which suppresses compiler
> >optimization, but doesn't affect what the CPU can do with the variable.
> >
> >So I doubt the difference will be measurable anywhere.
> >
>
> Because it is treated as a volatile object, the compiler will have to
> reread the value of the relevant page structure field in every iteration of
> the loop (512 for x86) when pmd_write(*pmd) is true. I saw a slight
> improvement (about 2%) in a microbenchmark that I wrote to break up 1000 THPs
> with 1000 forked processes.

Then bring a patch with performance data.

-- 
 Kirill A. Shutemov
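
For readers following the thread, the compiler-side point can be illustrated with a minimal userspace sketch. This is not the kernel code: fake_atomic_t, fake_atomic_read(), NR_PTES, the two split_check_*() functions and check_variants_agree() are hypothetical names made up for illustration, and the "!= 1" test is only a stand-in for whatever the loop checks. The sketch assumes the counter does not change while the loop runs, which is the invariance claimed in the original patch description.

```c
#include <assert.h>

/* 512 iterations, matching the "512 for x86" figure quoted in the thread. */
#define NR_PTES 512

/* Hypothetical stand-in for the kernel's atomic_t; not the real definition. */
typedef struct {
    int counter;
} fake_atomic_t;

/* Reads the counter through a cast to volatile, as described in the thread. */
static inline int fake_atomic_read(const fake_atomic_t *v)
{
    return *(const volatile int *)&v->counter;
}

/*
 * Variant A: the invariant check stays inside the loop. Because the read is
 * volatile, the compiler may not merge or hoist it, so the counter is loaded
 * again on every iteration in which pmd_writable is true.
 */
static int split_check_in_loop(const fake_atomic_t *mapcount, int pmd_writable)
{
    int violations = 0;

    for (int i = 0; i < NR_PTES; i++) {
        if (pmd_writable && fake_atomic_read(mapcount) != 1)
            violations++;
    }
    return violations;
}

/*
 * Variant B: the check is evaluated once before the loop, so at most one
 * volatile read is performed regardless of the iteration count.
 */
static int split_check_hoisted(const fake_atomic_t *mapcount, int pmd_writable)
{
    const int bad = pmd_writable && fake_atomic_read(mapcount) != 1;
    int violations = 0;

    for (int i = 0; i < NR_PTES; i++) {
        if (bad)
            violations++;
    }
    return violations;
}

/* With an unchanging counter, both variants must return the same count. */
static void check_variants_agree(const fake_atomic_t *mapcount, int pmd_writable)
{
    assert(split_check_in_loop(mapcount, pmd_writable) ==
           split_check_hoisted(mapcount, pmd_writable));
}
```

The two variants are functionally equivalent as long as nothing modifies the counter during the loop. Comparing the optimized assembly of the two functions (for example with gcc -O2 -S) would be one way to see the repeated load in variant A. Whether that extra load is measurable in practice is exactly the question the thread leaves open.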