Patch "LoongArch: Define the __io_aw() hook as mmiowb()" has been added to the 6.6-stable tree

This is a note to let you know that I've just added the patch titled

    LoongArch: Define the __io_aw() hook as mmiowb()

to the 6.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     loongarch-define-the-__io_aw-hook-as-mmiowb.patch
and it can be found in the queue-6.6 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 2b62f6e20369bf49c95859d693b080c8b7d1b4b1
Author: Huacai Chen <chenhuacai@xxxxxxxxxx>
Date:   Tue Mar 19 15:50:34 2024 +0800

    LoongArch: Define the __io_aw() hook as mmiowb()
    
    [ Upstream commit 9c68ece8b2a5c5ff9b2fcaea923dd73efeb174cd ]
    
    Commit fb24ea52f78e0d595852e ("drivers: Remove explicit invocations of
    mmiowb()") removed all mmiowb() calls in drivers, but it says:
    
    "NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
    spin_unlock(). However, pairing each mmiowb() removal in this patch with
    the corresponding call to spin_unlock() is not at all trivial, so there
    is a small chance that this change may regress any drivers incorrectly
    relying on mmiowb() to order MMIO writes between CPUs using lock-free
    synchronisation."
    
    The MMIO in radeon_ring_commit() is protected by a mutex rather than a
    spinlock, but in the mutex fastpath it behaves similarly to a spinlock.
    We could add mmiowb() calls in the radeon driver, but the maintainer says
    he doesn't like such a workaround, and radeon is not the only example of
    mutex-protected MMIO.
    
    So we should extend the mmiowb tracking system from spinlocks to mutexes,
    and maybe other locking primitives. This is difficult and error prone, so
    we solve it in the architectural code instead, by simply defining the
    __io_aw() hook as mmiowb(). We then no longer need to override
    queued_spin_unlock(), so use the generic definition.
    
    Without this, we get errors like the following when running 'glxgears'
    on weakly ordered architectures such as LoongArch:
    
    radeon 0000:04:00.0: ring 0 stalled for more than 10324msec
    radeon 0000:04:00.0: ring 3 stalled for more than 10240msec
    radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000001f412 last fence id 0x000000000001f414 on ring 3)
    radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000000f940 last fence id 0x000000000000f941 on ring 0)
    radeon 0000:04:00.0: scheduling IB failed (-35).
    [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
    radeon 0000:04:00.0: scheduling IB failed (-35).
    [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
    radeon 0000:04:00.0: scheduling IB failed (-35).
    [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
    radeon 0000:04:00.0: scheduling IB failed (-35).
    [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
    radeon 0000:04:00.0: scheduling IB failed (-35).
    [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
    radeon 0000:04:00.0: scheduling IB failed (-35).
    [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
    radeon 0000:04:00.0: scheduling IB failed (-35).
    [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
    
    Link: https://lore.kernel.org/dri-devel/29df7e26-d7a8-4f67-b988-44353c4270ac@xxxxxxx/T/#t
    Link: https://lore.kernel.org/linux-arch/20240301130532.3953167-1-chenhuacai@xxxxxxxxxxx/T/#t
    Cc: stable@xxxxxxxxxxxxxxx
    Signed-off-by: Huacai Chen <chenhuacai@xxxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild
index 93783fa24f6e9..dede0b422cfb9 100644
--- a/arch/loongarch/include/asm/Kbuild
+++ b/arch/loongarch/include/asm/Kbuild
@@ -4,6 +4,7 @@ generic-y += mcs_spinlock.h
 generic-y += parport.h
 generic-y += early_ioremap.h
 generic-y += qrwlock.h
+generic-y += qspinlock.h
 generic-y += rwsem.h
 generic-y += segment.h
 generic-y += user.h
diff --git a/arch/loongarch/include/asm/io.h b/arch/loongarch/include/asm/io.h
index c486c2341b662..4a8adcca329b8 100644
--- a/arch/loongarch/include/asm/io.h
+++ b/arch/loongarch/include/asm/io.h
@@ -71,6 +71,8 @@ extern void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t
 #define memcpy_fromio(a, c, l) __memcpy_fromio((a), (c), (l))
 #define memcpy_toio(c, a, l)   __memcpy_toio((c), (a), (l))
 
+#define __io_aw() mmiowb()
+
 #include <asm-generic/io.h>
 
 #define ARCH_HAS_VALID_PHYS_ADDR_RANGE
diff --git a/arch/loongarch/include/asm/qspinlock.h b/arch/loongarch/include/asm/qspinlock.h
deleted file mode 100644
index 34f43f8ad5912..0000000000000
--- a/arch/loongarch/include/asm/qspinlock.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_QSPINLOCK_H
-#define _ASM_QSPINLOCK_H
-
-#include <asm-generic/qspinlock_types.h>
-
-#define queued_spin_unlock queued_spin_unlock
-
-static inline void queued_spin_unlock(struct qspinlock *lock)
-{
-	compiletime_assert_atomic_type(lock->locked);
-	c_sync();
-	WRITE_ONCE(lock->locked, 0);
-}
-
-#include <asm-generic/qspinlock.h>
-
-#endif /* _ASM_QSPINLOCK_H */
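
To illustrate why moving the barrier into the __io_aw() hook covers the mutex case, here is a minimal user-space sketch. It is a toy model, not kernel code: every name in it (struct model_cpu, model_writel_generic(), model_writel_patched(), model_spin_unlock(), model_mutex_unlock()) is hypothetical. It assumes the generic after-write hook only marks an mmiowb as pending, to be flushed at spin unlock, and it ignores the lock-nesting bookkeeping that the real tracking code performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names are kernel APIs. */
struct model_cpu {
	bool mmiowb_pending;   /* set by the modelled generic after-write hook */
	int  barriers_issued;  /* how many modelled mmiowb() barriers ran */
	int  unordered_writes; /* MMIO writes not yet followed by a barrier */
};

/* Modelled mmiowb(): orders all earlier MMIO writes. */
static inline void model_mmiowb(struct model_cpu *cpu)
{
	cpu->barriers_issued++;
	cpu->unordered_writes = 0;
	cpu->mmiowb_pending = false;
}

/* Assumed generic behaviour: the after-write hook only records "pending". */
static inline void model_writel_generic(struct model_cpu *cpu)
{
	cpu->unordered_writes++;
	cpu->mmiowb_pending = true;
}

/* Behaviour with this patch: the after-write hook is the barrier itself. */
static inline void model_writel_patched(struct model_cpu *cpu)
{
	cpu->unordered_writes++;
	model_mmiowb(cpu);
}

/* Spin unlock flushes a pending mmiowb (simplified tracking). */
static inline void model_spin_unlock(struct model_cpu *cpu)
{
	if (cpu->mmiowb_pending)
		model_mmiowb(cpu);
}

/* Mutex unlock has no mmiowb tracking, so it does nothing here. */
static inline void model_mutex_unlock(struct model_cpu *cpu)
{
	(void)cpu;
}
```

In this model, a generic write followed by model_mutex_unlock() leaves unordered_writes at 1, which mirrors the radeon_ring_commit() situation described in the commit message. A patched write leaves it at 0 regardless of which unlock follows, at the cost of one barrier per write.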



