Hello,

On Mon, Jul 11, 2016 at 01:32:06PM -0400, Waiman Long wrote:
...
> A new header file include/linux/dlock-list.h will be added with the

Heh, I think percpu_list was the better name but suppose I'm too late.

> associated dlock_list_head and dlock_list_node structures. The following
> functions are provided to manage the per-cpu list:
>
>  1. int init_dlock_list_head(struct dlock_list_head **pdlock_head)
>  2. void dlock_list_add(struct dlock_list_node *node,
>                         struct dlock_list_head *head)
>  3. void dlock_list_del(struct dlock_list *node)
>
> Iteration of all the list entries within a group of per-cpu
> lists is done by calling either the dlock_list_iterate() or
> dlock_list_iterate_safe() functions in a while loop. They correspond
> to the list_for_each_entry() and list_for_each_entry_safe() macros
> respectively. The iteration states are keep in a dlock_list_state
> structure that is passed to the iteration functions.

Why do we need two variants of this?  We need a state variable to walk
the list anyway.  Why not make dlock_list_iterate() safe against
removal and get rid of the second variant?  Also, dlock_list_next()
probably is a better name.

> +/*
> + * include/linux/dlock-list.h
> + *
> + * A distributed (per-cpu) set of lists each of which is protected by its
> + * own spinlock, but acts like a single consolidated list to the callers.
> + *
> + * The dlock_list_head structure contains the spinlock, the other
> + * dlock_list_node structures only contains a pointer to the spinlock in
> + * dlock_list_head.
> + */
> +struct dlock_list_head {
> +	struct list_head list;
> +	spinlock_t lock;
> +};
> +
> +#define DLOCK_LIST_HEAD_INIT(name) \
> +	{ \
> +		.list.prev = &name.list, \
> +		.list.next = &name.list, \
> +		.list.lock = __SPIN_LOCK_UNLOCKED(name), \
> +	}

This is confusing.  It looks like dlock_list_head and
DLOCK_LIST_HEAD_INIT() can be used to define and initialize static
dlock_lists but that isn't true.
It's weird to require the user to deal with percpu declaration of the
data type.  Shouldn't it be more like the following?

 struct dlock_list_head_cpu {
	struct list_head list;
	spinlock_t lock;
 };

 struct dlock_list_head {
	struct dlock_list_head_cpu *head_cpu;
 };

> +/*
> + * Per-cpu list iteration state
> + */
> +struct dlock_list_state {
> +	int cpu;
> +	spinlock_t *lock;
> +	struct list_head *head;	/* List head of current per-cpu list */
> +	struct dlock_list_node *curr;
> +	struct dlock_list_node *next;
> +};

Maybe dlock_list_iter[ator] is a better name?

> +#define DLOCK_LIST_STATE_INIT() \
> +	{ \
> +		.cpu = -1, \
> +		.lock = NULL, \
> +		.head = NULL, \
> +		.curr = NULL, \
> +		.next = NULL, \
> +	}

The NULL inits are unnecessary and prone to being left behind.

> +#define DEFINE_DLOCK_LIST_STATE(s) \
> +	struct dlock_list_state s = DLOCK_LIST_STATE_INIT()
> +
> +static inline void init_dlock_list_state(struct dlock_list_state *state)
> +{
> +	state->cpu = -1;
> +	state->lock = NULL;
> +	state->head = NULL;
> +	state->curr = NULL;
> +	state->next = NULL;
> +}

Why not "*state = (struct dlock_list_state)DLOCK_LIST_STATE_INIT();"?

> +#ifdef CONFIG_DEBUG_SPINLOCK
> +#define DLOCK_LIST_WARN_ON(x)	WARN_ON(x)
> +#else
> +#define DLOCK_LIST_WARN_ON(x)
> +#endif

I'd just use WARN_ON_ONCE() without the CONFIG guard.

> +/*
> + * Next per-cpu list entry
> + */
> +#define dlock_list_next_entry(pos, member) list_next_entry(pos, member.list)

Why does this need to be exposed?

> +/*
> + * Per-cpu node data structure
> + */
> +struct dlock_list_node {
> +	struct list_head list;
> +	spinlock_t *lockptr;
> +};
> +
> +#define DLOCK_LIST_NODE_INIT(name) \
> +	{ \
> +		.list.prev = &name.list, \
> +		.list.next = &name.list, \
> +		.list.lockptr = NULL \
> +	}

Ditto with NULL init.

> +static inline void init_dlock_list_node(struct dlock_list_node *node)
> +{
> +	INIT_LIST_HEAD(&node->list);
> +	node->lockptr = NULL;
> +}

Ditto with init.
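
To make the two initializer comments above concrete, here is a minimal
userspace sketch (not kernel code and not taken from the patch).  It
assumes stand-in types: the void pointers replace spinlock_t * and
struct dlock_list_node *, and dlock_list_state_selfcheck() is a
hypothetical helper added only for illustration.  It relies on the C
rule that members omitted from a designated initializer are
zero-initialized, so only .cpu has to be named, and on a compound
literal so the runtime init can reuse the same macro.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (assumption). */
struct list_head { struct list_head *prev, *next; };

/* Stand-in for the iteration state quoted above; void * replaces the
 * kernel pointer types so the sketch builds outside the kernel. */
struct dlock_list_state {
	int cpu;
	void *lock;
	struct list_head *head;
	void *curr;
	void *next;
};

/* Only the non-zero member is named; every other member is
 * zero-initialized, which means NULL for the pointers. */
#define DLOCK_LIST_STATE_INIT()	{ .cpu = -1 }

#define DEFINE_DLOCK_LIST_STATE(s) \
	struct dlock_list_state s = DLOCK_LIST_STATE_INIT()

/* Runtime init reuses the macro through a compound literal. */
static inline void init_dlock_list_state(struct dlock_list_state *state)
{
	*state = (struct dlock_list_state)DLOCK_LIST_STATE_INIT();
}

/* Hypothetical self-check: a used state is fully reset by the init. */
static inline void dlock_list_state_selfcheck(void)
{
	struct list_head dummy = { &dummy, &dummy };
	struct dlock_list_state s = { .cpu = 3, .head = &dummy };

	init_dlock_list_state(&s);
	assert(s.cpu == -1);
	assert(!s.lock && !s.head && !s.curr && !s.next);
}
```

With this shape a field added to the structure later is cleared by both
init paths without touching either of them, which is the "left behind"
hazard mentioned above.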
> +static inline void free_dlock_list_head(struct dlock_list_head **pdlock_head)
> +{
> +	free_percpu(*pdlock_head);
> +	*pdlock_head = NULL;
> +}

Why does this need to be inlined?

> +/*
> + * Check if all the per-cpu lists are empty
> + */

Please use proper function comments.

> +static inline bool dlock_list_empty(struct dlock_list_head *dlock_head)
> +{
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu)
> +		if (!list_empty(&per_cpu_ptr(dlock_head, cpu)->list))
> +			return false;
> +	return true;
> +}
> +
> +/*
> + * Helper function to find the first entry of the next per-cpu list
> + * It works somewhat like for_each_possible_cpu(cpu).
> + *
> + * Return: true if the entry is found, false if all the lists exhausted

Ditto about the comment.

> + */
> +static __always_inline bool
> +__dlock_list_next_cpu(struct dlock_list_head *head,
> +		      struct dlock_list_state *state)
> +{
...
> +static inline bool dlock_list_iterate_safe(struct dlock_list_head *head,
> +					   struct dlock_list_state *state)
> +{

Inlining these doesn't make sense to me.

> diff --git a/lib/Makefile b/lib/Makefile
> index 499fb35..92e8c38 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> +/*
> + * The dlock list lock needs its own class to avoid warning and stack
> + * trace when lockdep is enabled.
> + */

Can you please elaborate on this?

> +static struct lock_class_key dlock_list_key;
> +
> +/*
> + * Initialize the per-cpu list head
> + */
> +int init_dlock_list_head(struct dlock_list_head **pdlock_head)
> +{
> +	struct dlock_list_head *dlock_head;
> +	int cpu;
> +
> +	dlock_head = alloc_percpu(struct dlock_list_head);
> +	if (!dlock_head)
> +		return -ENOMEM;
> +
> +	for_each_possible_cpu(cpu) {
> +		struct dlock_list_head *head = per_cpu_ptr(dlock_head, cpu);
> +
> +		INIT_LIST_HEAD(&head->list);
> +		head->lock = __SPIN_LOCK_UNLOCKED(&head->lock);
> +		lockdep_set_class(&head->lock, &dlock_list_key);
> +	}
> +
> +	*pdlock_head = dlock_head;
> +	return 0;
> +}

Why is this called init?
Why not do the following?

 struct dlock_list_head *alloc_dlock_list_head(void);

Also, the pointer type needs to include __percpu annotation.

> +/*
> + * List selection is based on the CPU being used when the dlock_list_add()
> + * function is called. However, deletion may be done by a different CPU.
> + * So we still need to use a lock to protect the content of the list.
> + */
> +void dlock_list_add(struct dlock_list_node *node, struct dlock_list_head *head)
> +{
> +	struct dlock_list_head *myhead;
> +
> +	/*
> +	 * Disable preemption to make sure that CPU won't gets changed.
> +	 */
> +	myhead = get_cpu_ptr(head);
> +	spin_lock(&myhead->lock);
> +	node->lockptr = &myhead->lock;
> +	list_add(&node->list, &myhead->list);
> +	spin_unlock(&myhead->lock);
> +	put_cpu_ptr(head);
> +}

I wonder whether it'd be better to use irqsafe operations.  lists tend
to be often used from irq contexts.

> +/*
> + * Delete a node from a dlock list
> + *
> + * We need to check the lock pointer again after taking the lock to guard
> + * against concurrent delete of the same node. If the lock pointer changes
> + * (becomes NULL or to a different one), we assume that the deletion was done
> + * elsewhere.
> + */
> +void dlock_list_del(struct dlock_list_node *node)
> +{
> +	spinlock_t *lock = READ_ONCE(node->lockptr);
> +
> +	if (unlikely(!lock)) {
> +		WARN(1, "dlock_list_del: node 0x%lx has no associated lock\n",
> +		     (unsigned long)node);
> +		return;
> +	}
> +
> +	spin_lock(lock);
> +	if (likely(lock == node->lockptr)) {
> +		list_del_init(&node->list);
> +		node->lockptr = NULL;
> +	} else {
> +		/*
> +		 * This path should never be executed.
> +		 */

What if it races against someone else removing and adding back?
Shouldn't it retry on those cases?

Thanks.

--
tejun
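
As a footnote to the last question, the retry could look roughly like
the userspace sketch below.  This is only one reading of the
suggestion, not the patch's code: struct spin, struct dlist_head,
struct dlist_node and the dlist_*() helpers are hypothetical stand-ins
built on C11 atomics, and the sketch assumes every list head outlives
the nodes that point at its lock.  The point is the loop in
dlist_del(): if the lock pointer changed between the unlocked read and
taking the lock, the node was removed and re-added elsewhere, so the
delete drops the stale lock and starts over instead of giving up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins (assumptions): a tiny spinlock and list_head. */
struct spin { atomic_flag held; };
struct list_head { struct list_head *prev, *next; };

struct dlist_head { struct list_head list; struct spin lock; };
struct dlist_node { struct list_head list; struct spin *_Atomic lockptr; };

static inline void spin_lock(struct spin *s)
{
	while (atomic_flag_test_and_set_explicit(&s->held, memory_order_acquire))
		;	/* busy-wait until the holder clears the flag */
}

static inline void spin_unlock(struct spin *s)
{
	atomic_flag_clear_explicit(&s->held, memory_order_release);
}

static inline void dlist_head_init(struct dlist_head *h)
{
	h->list.prev = h->list.next = &h->list;
	atomic_flag_clear(&h->lock.held);
}

static inline void dlist_node_init(struct dlist_node *n)
{
	n->list.prev = n->list.next = &n->list;
	atomic_init(&n->lockptr, NULL);
}

/* Insert at the front of @h and record which lock protects the node. */
static inline void dlist_add(struct dlist_node *n, struct dlist_head *h)
{
	spin_lock(&h->lock);
	assert(atomic_load(&n->lockptr) == NULL);	/* not on a list yet */
	n->list.next = h->list.next;
	n->list.prev = &h->list;
	h->list.next->prev = &n->list;
	h->list.next = &n->list;
	atomic_store(&n->lockptr, &h->lock);
	spin_unlock(&h->lock);
}

/* Returns true if this call unlinked the node, false if it was
 * already off every list when we looked. */
static inline bool dlist_del(struct dlist_node *n)
{
	for (;;) {
		struct spin *lock = atomic_load(&n->lockptr);

		if (!lock)
			return false;

		spin_lock(lock);
		if (atomic_load(&n->lockptr) == lock) {
			n->list.prev->next = n->list.next;
			n->list.next->prev = n->list.prev;
			n->list.prev = n->list.next = &n->list;
			atomic_store(&n->lockptr, NULL);
			spin_unlock(lock);
			return true;
		}
		/* Removed and re-added under another lock: retry. */
		spin_unlock(lock);
	}
}
```

A remove followed by a re-add onto the same list needs no special
handling here, because the pointer comparison still succeeds and the
node really is on the list guarded by that lock.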