Re: [PATCH v3 1/3] lib/percpu-list: Per-cpu list with associated per-cpu locks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02/24/2016 02:56 AM, Jan Kara wrote:
On Tue 23-02-16 14:04:30, Waiman Long wrote:
Linked list is used everywhere in the Linux kernel. However, if many
threads are trying to add or delete entries into the same linked list,
it can create a performance bottleneck.

This patch introduces a new per-cpu list subystem with associated
per-cpu locks for protecting each of the lists individually. This
allows list entries insertion and deletion operations to happen in
parallel instead of being serialized with a global list and lock.

List entry insertion is strictly per cpu. List deletion, however, can
happen in a cpu other than the one that did the insertion. So we still
need lock to protect the list. Because of that, there may still be
a small amount of contention when deletion is being done.

A new header file include/linux/percpu-list.h will be added with the
associated pcpu_list_head and pcpu_list_node structures. The following
functions are provided to manage the per-cpu list:

  1. int init_pcpu_list_head(struct pcpu_list_head **ppcpu_head)
  2. void pcpu_list_add(struct pcpu_list_node *node,
		       struct pcpu_list_head *head)
  3. void pcpu_list_del(struct pcpu_list *node)

Iteration of all the list entries within a group of per-cpu
lists is done by calling either the pcpu_list_iterate() or
pcpu_list_iterate_safe() functions in a while loop. They correspond
to the list_for_each_entry() and list_for_each_entry_safe() macros
respectively. The iteration states are keep in a pcpu_list_state
structure that is passed to the iteration functions.

Signed-off-by: Waiman Long<Waiman.Long@xxxxxxx>
Two comments below.

+/*
+ * Helper function to find the first entry of the next per-cpu list
+ * It works somewhat like for_each_possible_cpu(cpu).
+ *
+ * Return: true if the entry is found, false if all the lists exhausted
+ */
+static __always_inline bool
+__pcpu_list_next_cpu(struct pcpu_list_head *head, struct pcpu_list_state *state)
+{
+	if (state->lock)
+		spin_unlock(state->lock);
+next_cpu:
+	/*
+	 * for_each_possible_cpu(cpu)
+	 */
+	state->cpu = cpumask_next(state->cpu, cpu_possible_mask);
+	if (state->cpu>= nr_cpu_ids)
+		return false;	/* All the per-cpu lists iterated */
+
+	state->head =&per_cpu_ptr(head, state->cpu)->list;
+	state->lock =&per_cpu_ptr(head, state->cpu)->lock;
+	state->curr = list_entry(state->head->next,
+				 struct pcpu_list_node, list);
+	if (&state->curr->list == state->head)
+		goto next_cpu;
This might be more comprehensible as:

	if (list_empty(state->head))
		goto next_cpu;

and you can do it just after updating state->head (no need to init
state->lock&  state->curr if the list is empty).

Thank for the suggestion. Will change the code accordingly.
Another note: Initialization of state->curr is IMO racy - you need to hold
state->lock to grab state->curr reliably, don't you? Otherwise someone can
remove the entry while you are working with it. So you need to move that
down just before returning.

Right. I will move the initialization of state->curr after the spin_lock().

+
+	spin_lock(state->lock);
+	return true;
+}
+#endif /* NR_CPUS == 1 */
...

+/*
+ * Delete a node from a percpu list
+ *
+ * We need to check the lock pointer again after taking the lock to guard
+ * against concurrent delete of the same node. If the lock pointer changes
+ * (becomes NULL or to a different one), we assume that the deletion was done
+ * elsewhere.
+ */
+void pcpu_list_del(struct pcpu_list_node *node)
+{
+	spinlock_t *lock = READ_ONCE(node->lockptr);
+
+	if (unlikely(!lock))
+		return;
+
+	spin_lock(lock);
+	if (likely(lock == node->lockptr)) {
+		list_del_init(&node->list);
+		node->lockptr = NULL;
+	}
But someone changing lockptr under your hands would mean that there are
two processes racing to remove entries and that would generally point to a
problem (and likely use-after-free) in the caller, won't it? Or do you have
some particular usecase in mind?

								Honza


This is just defensive programming to guard against unforeseen case. I don't have any particular use case in mind that will make that happen. Maybe I should put a WARN_ON if this really happens.

Cheers,
Longman

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux