To clarify: local IRQs are disabled by calling raw_local_irq_save(flags).
The function __lock_acquire then double-checks that local IRQs are really
disabled. For illustration, a minimal sketch of the spinlock-to-atomic
pattern being discussed is appended below the quoted thread.

On 2017-06-22 01:34 PM, axie wrote:
> Hi Marek,
>
> Spin lock and spin unlock are fast. But they are not so fast compared
> with an atomic, which is a single CPU instruction on x86.
>
>
> 1. A spinlock does NOT allow preemption on the local CPU. Let us have a
> look at how spin lock is implemented.
>
> static inline void __raw_spin_lock(raw_spinlock_t *lock)
> {
>         preempt_disable(); <<<<<<<<<<<<<<<<-- This is a memory barrier
>                                               operation too.
>         spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
>         LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
> }
>
> 2. The function __lock_acquire is called by spinlock. The function is so
> long that I will not attach all of it here.
>
> There is an atomic operation inside, plus 12 metadata updates and 14 if
> statements, and it calls quite a few other functions.
>
> Note that it disables IRQ...
>
> static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
>                           int trylock, int read, int check, int hardirqs_off,
>                           struct lockdep_map *nest_lock, unsigned long ip,
>                           int references, int pin_count)
> {
>         struct task_struct *curr = current;
>         struct lock_class *class = NULL;
>         struct held_lock *hlock;
>         unsigned int depth;
>         int chain_head = 0;
>         int class_idx;
>         u64 chain_key;
>
>         if (unlikely(!debug_locks))
>                 return 0;
>
>         /*
>          * Lockdep should run with IRQs disabled, otherwise we could
>          * get an interrupt which would want to take locks, which would
>          * end up in lockdep and have you got a head-ache already?
>          */
>         if (DEBUG_LOCKS_WARN_ON(!irqs_disabled())) <<<<<<<<<<<<<<< Disable IRQ
>                 return 0;
>
>         ....
>
> 3. Another function called by spinlock at a higher level:
>
> void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
>                   int trylock, int read, int check,
>                   struct lockdep_map *nest_lock, unsigned long ip)
> {
>         unsigned long flags;
>
>         if (unlikely(current->lockdep_recursion))
>                 return;
>
>         raw_local_irq_save(flags);
>         check_flags(flags);
>
>         current->lockdep_recursion = 1;
>         trace_lock_acquire(lock, subclass, trylock, read, check,
>                            nest_lock, ip);
>         __lock_acquire(lock, subclass, trylock, read, check,
>                        irqs_disabled_flags(flags), nest_lock, ip, 0, 0);
>         current->lockdep_recursion = 0;
>         raw_local_irq_restore(flags);
> }
>
>
> Thanks,
>
> Alex Bin
>
>
> On 2017-06-22 12:27 PM, Marek Olšák wrote:
>> On Thu, Jun 22, 2017 at 5:33 PM, Xie, AlexBin <AlexBin.Xie at amd.com>
>> wrote:
>>> Hi Christian,
>>>
>>>
>>> In fact, the change from spinlock to atomic is quite painful. When I
>>> started, I thought it was easy, but later I found there might be race
>>> conditions here and there. Now I think the change looks more robust. In
>>> the kernel source, several other drivers use the same trick.
>>>
>>>
>>> On the other hand, I think the logic itself might be optimized
>>> considering the locking. But I had spent quite some effort to maintain
>>> the original logic.
>> It seems quite complicated and I don't know if there is any
>> performance benefit. Spinlocks are nice because they allow preemption.
>>
>> It would be more interesting to merge the CS and BO_LIST ioctls into
>> one.
>>
>> Marek
>
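
For illustration only, here is a minimal sketch of the general
spinlock-to-atomic conversion being discussed. It is not the actual amdgpu
patch; the structure and function names (example_ctx, example_get,
example_put) are made up, and only atomic_inc()/atomic_dec_and_test() are
the real kernel primitives:

#include <linux/atomic.h>
#include <linux/types.h>

/* Hypothetical context; stands in for whatever the lock protected. */
struct example_ctx {
        /* before: spinlock_t lock; unsigned int refcount; */
        atomic_t refcount;
};

static void example_get(struct example_ctx *ctx)
{
        /*
         * before:
         *         spin_lock(&ctx->lock);
         *         ctx->refcount++;
         *         spin_unlock(&ctx->lock);
         *
         * after: a single locked instruction on x86, with no
         * preempt_disable() and no lockdep bookkeeping on this path.
         */
        atomic_inc(&ctx->refcount);
}

static bool example_put(struct example_ctx *ctx)
{
        /* Returns true when the count drops to zero. */
        return atomic_dec_and_test(&ctx->refcount);
}

The trade-off is the one discussed above: the atomic path is cheaper,
especially on lockdep-enabled kernels, but every multi-step update that the
spinlock used to cover must be re-audited for races.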