On 2021/10/16 3:58 AM, Alexei Starovoitov wrote:
> On Fri, Oct 15, 2021 at 11:04 AM Chengming Zhou
> <zhouchengming@xxxxxxxxxxxxx> wrote:
>>
>> We only use count for kmalloc hashtab not for prealloc hashtab, because
>> __pcpu_freelist_pop() return NULL when no more elem in pcpu freelist.
>>
>> But the problem is that __pcpu_freelist_pop() will traverse all CPUs and
>> spin_lock for all CPUs to find there is no more elem at last.
>>
>> We encountered bad case on big system with 96 CPUs that alloc_htab_elem()
>> would last for 1ms. This patch use count for prealloc hashtab too,
>> avoid traverse and spin_lock for all CPUs in this case.
>>
>> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
>
> It's not clear from the commit log what you're solving.
> The atomic inc/dec in critical path of prealloc maps hurts performance.
> That's why it's not used.
>

Thanks for the explanation. What I'm solving is the case where the hash
table has no free elements left: in that case we don't need to call
__pcpu_freelist_pop(), which traverses all CPUs and takes the spin_lock
of each one only to find that nothing is available.
The ftrace output of this bad case is below:

 50)               |  htab_map_update_elem() {
 50)   0.329 us    |    _raw_spin_lock_irqsave();
 50)   0.063 us    |    lookup_elem_raw();
 50)               |    alloc_htab_elem() {
 50)               |      pcpu_freelist_pop() {
 50)   0.209 us    |        _raw_spin_lock();
 50)   0.264 us    |        _raw_spin_lock();
 50)   0.231 us    |        _raw_spin_lock();
 50)   0.168 us    |        _raw_spin_lock();
 50)   0.168 us    |        _raw_spin_lock();
 50)   0.300 us    |        _raw_spin_lock();
 50)   0.263 us    |        _raw_spin_lock();
 50)   0.304 us    |        _raw_spin_lock();
 50)   0.168 us    |        _raw_spin_lock();
 50)   0.177 us    |        _raw_spin_lock();
 50)   0.235 us    |        _raw_spin_lock();
 50)   0.162 us    |        _raw_spin_lock();
 50)   0.186 us    |        _raw_spin_lock();
 50)   0.185 us    |        _raw_spin_lock();
 50)   0.315 us    |        _raw_spin_lock();
 50)   0.172 us    |        _raw_spin_lock();
 50)   0.180 us    |        _raw_spin_lock();
 50)   0.173 us    |        _raw_spin_lock();
 50)   0.176 us    |        _raw_spin_lock();
 50)   0.261 us    |        _raw_spin_lock();
 50)   0.364 us    |        _raw_spin_lock();
 50)   0.180 us    |        _raw_spin_lock();
 50)   0.284 us    |        _raw_spin_lock();
 50)   0.226 us    |        _raw_spin_lock();
 50)   0.210 us    |        _raw_spin_lock();
 50)   0.237 us    |        _raw_spin_lock();
 50)   0.333 us    |        _raw_spin_lock();
 50)   0.295 us    |        _raw_spin_lock();
 50)   0.278 us    |        _raw_spin_lock();
 50)   0.260 us    |        _raw_spin_lock();
 50)   0.224 us    |        _raw_spin_lock();
 50)   0.447 us    |        _raw_spin_lock();
 50)   0.221 us    |        _raw_spin_lock();
 50)   0.320 us    |        _raw_spin_lock();
 50)   0.203 us    |        _raw_spin_lock();
 50)   0.213 us    |        _raw_spin_lock();
 50)   0.242 us    |        _raw_spin_lock();
 50)   0.230 us    |        _raw_spin_lock();
 50)   0.216 us    |        _raw_spin_lock();
 50)   0.525 us    |        _raw_spin_lock();
 50)   0.257 us    |        _raw_spin_lock();
 50)   0.235 us    |        _raw_spin_lock();
 50)   0.269 us    |        _raw_spin_lock();
 50)   0.368 us    |        _raw_spin_lock();
 50)   0.249 us    |        _raw_spin_lock();
 50)   0.217 us    |        _raw_spin_lock();
 50)   0.174 us    |        _raw_spin_lock();
 50)   0.173 us    |        _raw_spin_lock();
 50)   0.161 us    |        _raw_spin_lock();
 50)   0.282 us    |        _raw_spin_lock();
 50)   0.264 us    |        _raw_spin_lock();
 50)   0.160 us    |        _raw_spin_lock();
 50)   0.692 us    |        _raw_spin_lock();
 50)   0.185 us    |        _raw_spin_lock();
 50)   0.157 us    |        _raw_spin_lock();
 50)   0.168 us    |        _raw_spin_lock();
 50)   0.205 us    |        _raw_spin_lock();
 50)   0.189 us    |        _raw_spin_lock();
 50)   0.276 us    |        _raw_spin_lock();
 50)   0.171 us    |        _raw_spin_lock();
 50)   0.390 us    |        _raw_spin_lock();
 50)   0.164 us    |        _raw_spin_lock();
 50)   0.170 us    |        _raw_spin_lock();
 50)   0.188 us    |        _raw_spin_lock();
 50)   0.284 us    |        _raw_spin_lock();
 50)   0.191 us    |        _raw_spin_lock();
 50)   0.412 us    |        _raw_spin_lock();
 50)   0.285 us    |        _raw_spin_lock();
 50)   0.296 us    |        _raw_spin_lock();
 50)   0.315 us    |        _raw_spin_lock();
 50)   0.239 us    |        _raw_spin_lock();
 50)   0.225 us    |        _raw_spin_lock();
 50)   0.258 us    |        _raw_spin_lock();
 50)   0.228 us    |        _raw_spin_lock();
 50)   0.240 us    |        _raw_spin_lock();
 50)   0.297 us    |        _raw_spin_lock();
 50)   0.216 us    |        _raw_spin_lock();
 50)   0.213 us    |        _raw_spin_lock();
 50)   0.225 us    |        _raw_spin_lock();
 50)   0.223 us    |        _raw_spin_lock();
 50)   0.287 us    |        _raw_spin_lock();
 50)   0.258 us    |        _raw_spin_lock();
 50)   0.295 us    |        _raw_spin_lock();
 50)   0.262 us    |        _raw_spin_lock();
 50)   0.325 us    |        _raw_spin_lock();
 50)   0.203 us    |        _raw_spin_lock();
 50)   0.325 us    |        _raw_spin_lock();
 50)   0.255 us    |        _raw_spin_lock();
 50)   0.325 us    |        _raw_spin_lock();
 50)   0.216 us    |        _raw_spin_lock();
 50)   0.232 us    |        _raw_spin_lock();
 50)   0.804 us    |        _raw_spin_lock();
 50)   0.262 us    |        _raw_spin_lock();
 50)   0.242 us    |        _raw_spin_lock();
 50)   0.271 us    |        _raw_spin_lock();
 50)   0.175 us    |        _raw_spin_lock();
 50) + 61.026 us   |      }
 50) + 61.575 us   |    }
 50)   0.051 us    |    _raw_spin_unlock_irqrestore();
 50) + 64.863 us   |  }
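
To make the cost in the trace concrete, here is a simplified, single-threaded user-space model of the idea. It is only a sketch and not the kernel code or the patch itself: model_htab, model_init(), model_pop() and model_alloc() are hypothetical names, the per-CPU lists are plain arrays, and a counter of list visits stands in for the _raw_spin_lock() calls. In the kernel the element count would have to be an atomic counter, which is exactly the cost the review comment above objects to.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 96 /* same CPU count as the machine in the report */

struct elem {
	struct elem *next;
};

/* Hypothetical stand-in for a preallocated hash table. */
struct model_htab {
	struct elem *freelist[NR_CPUS]; /* one free list per CPU */
	int count;                      /* elements currently handed out */
	int max_entries;                /* total preallocated elements */
};

/* Spread the preallocated pool round-robin over the per-CPU lists. */
static void model_init(struct model_htab *h, struct elem *pool, int n)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		h->freelist[cpu] = NULL;
	for (int i = 0; i < n; i++) {
		pool[i].next = h->freelist[i % NR_CPUS];
		h->freelist[i % NR_CPUS] = &pool[i];
	}
	h->count = 0;
	h->max_entries = n;
}

/*
 * Visit each CPU's list, starting at this_cpu. *locks counts the visits
 * and stands in for the per-CPU raw_spin_lock() calls in the trace.
 */
static struct elem *model_pop(struct model_htab *h, int this_cpu, int *locks)
{
	for (int i = 0; i < NR_CPUS; i++) {
		int cpu = (this_cpu + i) % NR_CPUS;
		struct elem *e;

		(*locks)++;
		e = h->freelist[cpu];
		if (e) {
			h->freelist[cpu] = e->next;
			return e;
		}
	}
	return NULL; /* every list was empty after NR_CPUS visits */
}

/* Check the count first so a full table never starts the traversal. */
static struct elem *model_alloc(struct model_htab *h, int this_cpu, int *locks)
{
	struct elem *e;

	assert(h->count <= h->max_entries);
	if (h->count == h->max_entries)
		return NULL;

	e = model_pop(h, this_cpu, locks);
	if (e)
		h->count++;
	return e;
}
```

Once the table is full, calling model_pop() directly visits all NR_CPUS lists before returning NULL, while model_alloc() returns NULL without visiting any of them. The model leaves out freeing elements and all concurrency, so it says nothing about how the extra counter behaves on the hot path.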