On Fri, Dec 21, 2018 at 01:37:38AM +0000, Christopher Lameter wrote:
>On Thu, 20 Dec 2018, Andrew Morton wrote:
>
>> The result of (get_partial_count / get_partial_try_count):
>>
>> +----------+----------------+------------+-------------+
>> |          |      Base      |  Patched   | Improvement |
>> +----------+----------------+------------+-------------+
>> |One Node  |      1:3       |    1:0     |   - 100%    |
>
>If you have one node then you already searched all your slabs. So we could
>completely skip the get_any_partial() functionality in the non NUMA case
>(if nr_node_ids == 1)
>
>
>> +----------+----------------+------------+-------------+
>> |Four Nodes|     1:5.8      |   1:2.5    |   - 56%     |
>> +----------+----------------+------------+-------------+
>
>Hmm.... Ok but that is the extreme slowpath.
>
>> Each version/system configuration combination has four round kernel
>> build tests. Take the average result of real to compare.
>>
>> +----------+----------------+------------+-------------+
>> |          |      Base      |  Patched   | Improvement |
>> +----------+----------------+------------+-------------+
>> |One Node  |     4m41s      |   4m32s    |   - 4.47%   |
>> +----------+----------------+------------+-------------+
>> |Four Nodes|     4m45s      |   4m39s    |   - 2.92%   |
>> +----------+----------------+------------+-------------+
>
>3% on the four node case? That means that the slowpath is taken
>frequently. Wonder why?
>
>Can we also see the variability? Since this is a NUMA system there is
>bound to be some indeterminism in those numbers.

Hmm... I rebuilt the kernel and ran the experiment again, but found I
can't reproduce these statistics. The new data show the patched kernel
doing worse than the baseline, with results that fluctuate heavily:

        Base                     Patched

    real    5m49.652s        real    8m9.515s
    user    19m0.581s        user    17m30.296s
    sys     2m31.906s        sys     2m21.445s

    real    5m47.145s        real    6m47.437s
    user    19m17.445s       user    18m33.461s
    sys     2m41.931s        sys     2m43.249s

    real    7m2.043s         real    5m38.539s
    user    18m11.723s       user    19m40.552s
    sys     2m46.443s        sys     2m43.771s

    real    5m31.797s        real    12m59.936s
    user    19m13.984s       user    15m47.602s
    sys     2m34.727s        sys     2m20.385s

-- 
Wei Yang
Help you, Help me
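
As a rough way to put a number on the variability asked about, the four
"real" samples per kernel listed above can be reduced to a mean and a
sample standard deviation. This is only a minimal sketch in Python; the
helper names to_seconds() and summarize() are made up for illustration,
and four runs per kernel are too few to draw a firm conclusion.

```python
import re
import statistics


def to_seconds(t):
    """Convert a time(1)-style string such as '5m49.652s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", t)
    if m is None:
        raise ValueError(f"unrecognized time format: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


# "real" wall-clock times copied from the four runs reported above.
REAL = {
    "Base": ["5m49.652s", "5m47.145s", "7m2.043s", "5m31.797s"],
    "Patched": ["8m9.515s", "6m47.437s", "5m38.539s", "12m59.936s"],
}


def summarize(samples):
    """Return (mean, sample standard deviation) in seconds."""
    secs = [to_seconds(s) for s in samples]
    return statistics.mean(secs), statistics.stdev(secs)


if __name__ == "__main__":
    for name, samples in REAL.items():
        mean, sd = summarize(samples)
        print(f"{name:8s} mean {mean:7.1f}s  stdev {sd:6.1f}s")
```

The sample standard deviation (statistics.stdev) is used rather than the
population one because each set of four builds is a small sample of
possible runs.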