From: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
Date: Thu, 11 May 2017 16:47:05 -0400

> So, moving memset() into __init_single_page() benefits Intel. I am
> actually surprised why memset() is so slow on intel when it is called
> from memblock. But, hurts SPARC, I guess these membars at the end of
> memset() kills the performance.

Perhaps an x86 expert can chime in, but it might be the case that past
a certain size, the microcode for the enhanced stosb uses non-temporal
stores or something like that.

As for sparc64, yes we can get really killed by the transactional cost
of memset because of the membars.

But I wonder, for a single page struct, if we even use the special
stores and thus eat the membar cost.  struct page is only 64 bytes,
and the cutoff in the Niagara4 bzero implementation is "64 + (64 - 8)"
so indeed the initializing stores will not even be used.

So sparc64 will only use initializing stores and do the membars if
at least 2 pages are cleared at a time.
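
For illustration only, the size comparison above can be sketched in C.
This is not the Niagara4 bzero code itself. The macro and function names
are hypothetical, and the ">=" comparison is an assumption; only the
64-byte struct page size and the "64 + (64 - 8)" cutoff come from the
message above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the two values are the ones quoted in the message. */
#define STRUCT_PAGE_SIZE   64u
#define INIT_STORE_CUTOFF  (64u + (64u - 8u))   /* 120 bytes */

/*
 * Returns nonzero when a clear of 'len' bytes would reach the cutoff and
 * so take the initializing-store path (and pay for the membars).
 * Assumption: the cutoff is inclusive. For the sizes discussed here,
 * 64 and 128 bytes, an exclusive cutoff gives the same answer.
 */
static int uses_initializing_stores(size_t len)
{
	return len >= INIT_STORE_CUTOFF;
}

/* Smallest number of struct pages whose combined size reaches the cutoff. */
static unsigned int min_struct_pages_for_init_stores(void)
{
	return (INIT_STORE_CUTOFF + STRUCT_PAGE_SIZE - 1) / STRUCT_PAGE_SIZE;
}
```

With these numbers a single 64-byte struct page stays below the 120-byte
cutoff, while two of them (128 bytes) reach it, which matches the
conclusion above.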