To clarify: local IRQs are disabled by calling raw_local_irq_save(flags).
The function __lock_acquire then double-checks that local IRQs are really
disabled. For illustration, a minimal sketch of the spinlock-to-atomic
pattern being discussed is appended below the quoted thread.

On 2017-06-22 01:34 PM, axie wrote:
> Hi Marek,
>
> Spin lock and spin unlock are fast. But they are not so fast compared
> with an atomic, which is a single CPU instruction on x86.
>
>
> 1. A spinlock does NOT allow preemption on the local CPU. Let us have a
> look at how spin lock is implemented.
>
> static inline void __raw_spin_lock(raw_spinlock_t *lock)
> {
>         preempt_disable(); <<<<<<<<<<<<<<<<-- This is a memory barrier
>                                               operation too.
>         spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
>         LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
> }
>
> 2. The function __lock_acquire is called by spinlock. The function is so
> long that I will not attach all of it here.
>
> There is an atomic operation inside, plus 12 metadata updates and 14 if
> statements, and it calls quite a few other functions.
>
> Note that it disables IRQ...
>
> static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
>                           int trylock, int read, int check, int hardirqs_off,
>                           struct lockdep_map *nest_lock, unsigned long ip,
>                           int references, int pin_count)
> {
>         struct task_struct *curr = current;
>         struct lock_class *class = NULL;
>         struct held_lock *hlock;
>         unsigned int depth;
>         int chain_head = 0;
>         int class_idx;
>         u64 chain_key;
>
>         if (unlikely(!debug_locks))
>                 return 0;
>
>         /*
>          * Lockdep should run with IRQs disabled, otherwise we could
>          * get an interrupt which would want to take locks, which would
>          * end up in lockdep and have you got a head-ache already?
>          */
>         if (DEBUG_LOCKS_WARN_ON(!irqs_disabled())) <<<<<<<<<<<<<<< Disable IRQ
>                 return 0;
>
>         ....
>
> 3. Another function called by spinlock at a higher level:
>
> void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
>                   int trylock, int read, int check,
>                   struct lockdep_map *nest_lock, unsigned long ip)
> {
>         unsigned long flags;
>
>         if (unlikely(current->lockdep_recursion))
>                 return;
>
>         raw_local_irq_save(flags);
>         check_flags(flags);
>
>         current->lockdep_recursion = 1;
>         trace_lock_acquire(lock, subclass, trylock, read, check,
>                            nest_lock, ip);
>         __lock_acquire(lock, subclass, trylock, read, check,
>                        irqs_disabled_flags(flags), nest_lock, ip, 0, 0);
>         current->lockdep_recursion = 0;
>         raw_local_irq_restore(flags);
> }
>
>
> Thanks,
>
> Alex Bin
>
>
> On 2017-06-22 12:27 PM, Marek Olšák wrote:
>> On Thu, Jun 22, 2017 at 5:33 PM, Xie, AlexBin <AlexBin.Xie at amd.com>
>> wrote:
>>> Hi Christian,
>>>
>>>
>>> In fact, the change from spinlock to atomic is quite painful. When I
>>> started, I thought it was easy, but later I found there might be race
>>> conditions here and there. Now I think the change looks more robust. In
>>> the kernel source, several other drivers use the same trick.
>>>
>>>
>>> On the other hand, I think the logic itself might be optimized
>>> considering the locking. But I had spent quite some effort to maintain
>>> the original logic.
>> It seems quite complicated and I don't know if there is any
>> performance benefit. Spinlocks are nice because they allow preemption.
>>
>> It would be more interesting to merge the CS and BO_LIST ioctls into
>> one.
>>
>> Marek
>
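
For illustration only, here is a minimal sketch of the general
spinlock-to-atomic conversion being discussed. It is not the actual amdgpu
patch; the structure and function names (example_ctx, example_get,
example_put) are made up, and only atomic_inc()/atomic_dec_and_test() are
the real kernel primitives:

#include <linux/atomic.h>
#include <linux/types.h>

/* Hypothetical context; stands in for whatever the lock protected. */
struct example_ctx {
        /* before: spinlock_t lock; unsigned int refcount; */
        atomic_t refcount;
};

static void example_get(struct example_ctx *ctx)
{
        /*
         * before:
         *         spin_lock(&ctx->lock);
         *         ctx->refcount++;
         *         spin_unlock(&ctx->lock);
         *
         * after: a single locked instruction on x86, with no
         * preempt_disable() and no lockdep bookkeeping on this path.
         */
        atomic_inc(&ctx->refcount);
}

static bool example_put(struct example_ctx *ctx)
{
        /* Returns true when the count drops to zero. */
        return atomic_dec_and_test(&ctx->refcount);
}

The trade-off is the one discussed above: the atomic path is cheaper,
especially on lockdep-enabled kernels, but every multi-step update that the
spinlock used to cover must be re-audited for races.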