On Fri, Mar 17, 2017 at 10:21:58AM +0800, Aaron Lu wrote:
> On Thu, Mar 16, 2017 at 02:38:44PM -0500, Alex Thorlton wrote:
> > On Wed, Mar 15, 2017 at 04:59:59PM +0800, Aaron Lu wrote:
> > > v2 changes: Nothing major, only minor ones.
> > > - rebased on top of v4.11-rc2-mmotm-2017-03-14-15-41;
> > > - use list_add_tail instead of list_add to add a worker to the
> > >   tlb's worker list, so that when doing a flush, the first queued
> > >   worker gets flushed first (based on the assumption that the first
> > >   queued worker has a better chance of finishing its job than those
> > >   queued later);
> > > - use bool instead of int for the variable free_batch_page in
> > >   tlb_flush_mmu_free_batches;
> > > - style changes according to ./scripts/checkpatch;
> > > - reword some of the changelogs to make them more readable.
> > >
> > > v1 is here:
> > > https://lkml.org/lkml/2017/2/24/245
> >
> > I tested v1 on a Haswell system with 64 sockets/1024 cores/2048
> > threads and 8TB of RAM, with a 1TB malloc.  The average free() time
> > for a 1TB malloc on a vanilla kernel was 41.69s; the patched kernel
> > averaged 21.56s for the same test.
>
> Thanks a lot for the test result.
>
> > I am testing v2 now and will report back with results in the next
> > day or so.
>
> Testing plain v2 shouldn't bring any surprise/difference

You're right!  Not much difference here.  v2 averaged a 23.17s free
time for a 1TB allocation.

> better set the following params before the test (I'm planning to make
> them the defaults in the next version):
> # echo 64 > /sys/devices/virtual/workqueue/batch_free_wq/max_active
> # echo 1030 > /sys/kernel/debug/parallel_free/max_gather_batch_count

10 test runs with these params set averaged 22.22s to free 1TB.  So,
we're still seeing a nearly 50% decrease in free time vs. the
unpatched kernel.  (A sketch of the kind of timing test behind these
numbers is appended after the sign-off.)

- Alex
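
For reference, below is a minimal sketch of a timing harness along the
lines of the test described above.  The actual test program wasn't
posted in this thread, so the allocation size and details here are
illustrative, not the exact harness used:

#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void)
{
	/* 1TB, as in the test above; needs a machine with enough RAM. */
	size_t size = 1UL << 40;
	struct timespec start, end;
	char *buf;

	buf = malloc(size);
	if (!buf) {
		perror("malloc");
		return 1;
	}

	/* Touch every page so the kernel actually populates the mapping. */
	memset(buf, 1, size);

	clock_gettime(CLOCK_MONOTONIC, &start);
	/*
	 * For an allocation this large, glibc uses mmap, so free()
	 * munmaps it and the kernel's page-freeing path (what these
	 * patches parallelize) runs here, synchronously.
	 */
	free(buf);
	clock_gettime(CLOCK_MONOTONIC, &end);

	printf("free() took %.2fs\n",
	       (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9);
	return 0;
}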