On Thu 05-10-17 14:43:26, Waiman Long wrote: > The dlock list needs one list for each of the CPUs available. However, > for sibling CPUs, they are sharing the L2 and probably L1 caches > too. As a result, there is not much to gain in term of avoiding > cacheline contention while increasing the cacheline footprint of the > L1/L2 caches as separate lists may need to be in the cache. > > This patch makes all the sibling CPUs share the same list, thus > reducing the number of lists that need to be maintained in each > dlock list without having any noticeable impact on performance. It > also improves dlock list iteration performance as fewer lists need > to be iterated. > > Signed-off-by: Waiman Long <longman@xxxxxxxxxx> ... > @@ -118,7 +156,7 @@ bool dlock_lists_empty(struct dlock_list_heads *dlist) > { > int idx; > > - for (idx = 0; idx < nr_cpu_ids; idx++) > + for (idx = 0; idx < nr_dlock_lists; idx++) > if (!list_empty(&dlist->heads[idx].list)) > return false; > return true; > @@ -207,7 +245,7 @@ struct dlock_list_node *__dlock_list_next_list(struct dlock_list_iter *iter) > /* > * Try next list > */ > - if (++iter->index >= nr_cpu_ids) > + if (++iter->index >= nr_dlock_lists) > return NULL; /* All the entries iterated */ > > if (list_empty(&iter->head[iter->index].list)) Why these two do not need a similar treatment as alloc_dlist_heads()? Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR