Re: [PATCH] arm64/io: Don't use WZR in writel

Marc Gonzalez <marc.w.gonzalez@xxxxxxx> · Tue, 12 Mar 2019 13:36:25 +0100

On 24/02/2019 04:53, Bjorn Andersson wrote:

> On Sat 23 Feb 10:37 PST 2019, Marc Zyngier wrote:
> 
>> On Sat, 23 Feb 2019 18:12:54 +0000, Bjorn Andersson wrote:
>>>
>>> On Mon 11 Feb 06:59 PST 2019, Marc Zyngier wrote:
>>>
>>>> On 11/02/2019 14:29, AngeloGioacchino Del Regno wrote:
>>>>
>>>>> Also, just one more thing: yes this thing is going ARM64-wide and
>>>>> - from my findings - it's targeting certain Qualcomm SoCs, but...
>>>>> I'm not sure that only QC is affected by that, others may as well
>>>>> have the same stupid bug.
>>>>
>>>> At the moment, only QC SoCs seem to be affected, probably because
>>>> everyone else has debugged their hypervisor (or most likely doesn't
>>>> bother with shipping one).
>>>>
>>>> In all honesty, we need some information from QC here: which SoCs are
>>>> affected, what is the exact nature of the bug, can it be triggered from
>>>> EL0. Randomly papering over symptoms is not something I really like
>>>> doing, and is likely to generate problems on unaffected systems.
>>>
>>> The bug at hand is that the XZR is not deemed a valid source in the
>>> virtualization of the SMMU registers. It was identified and fixed for
>>> all platforms that are shipping kernels based on v4.9 or later.
>>
>> When you say "fixed": Do you mean fixed in the firmware? Or by adding
>> a workaround in the shipped kernel?
> 
> I mean that it's fixed in the firmware.
> 
>> If the former, is this part of an official QC statement, with an
>> associated erratum number?
> 
> I don't know, will get back to you on this one.
> 
>> Is this really limited to the SMMU accesses?
> 
> Yes.
> 
>>> As such Angelo's list of affected platforms covers the high-profile
>>> ones. In particular MSM8996 and MSM8998 is getting pretty good support
>>> upstream, if we can figure out a way around this issue.
>> 
>> We'd need an exhaustive list of the affected SoCs, and work out if we
>> can limit the hack to the SMMU driver (cc'ing Robin, who's the one
>> who'd know about it).
> 
> I will try to compose a list.

FWIW, I have just been bitten by this issue. I needed to enable an SMMU to
filter PCIe EP accesses to system RAM (or something). I'm using an APQ8098
MEDIABOX dev board. My system hangs in arm_smmu_device_reset() doing:

	/* Invalidate the TLB, just in case */
	writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH);
	writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH);


With the 'Z' constraint, gcc generates:

	str wzr, [x0]

without the 'Z' constraint, gcc generates:

	mov	w1, 0
	str w1, [x0]


I can work around the problem using the following patch:

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 045d93884164..93117519aed8 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -59,6 +59,11 @@
 
 #include "arm-smmu-regs.h"
 
+static inline void qcom_writel(u32 val, volatile void __iomem *addr)
+{
+	asm volatile("str %w0, [%1]" : : "r" (val), "r" (addr));
+}
+
 #define ARM_MMU500_ACTLR_CPRE		(1 << 1)
 
 #define ARM_MMU500_ACR_CACHE_LOCK	(1 << 26)
@@ -422,7 +427,7 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
 {
 	unsigned int spin_cnt, delay;
 
-	writel_relaxed(0, sync);
+	qcom_writel(0, sync);
 	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
 		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
 			if (!(readl_relaxed(status) & sTLBGSTATUS_GSACTIVE))
@@ -1760,8 +1765,8 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	}
 
 	/* Invalidate the TLB, just in case */
-	writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLH);
-	writel_relaxed(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH);
+	qcom_writel(0, gr0_base + ARM_SMMU_GR0_TLBIALLH);
+	qcom_writel(0, gr0_base + ARM_SMMU_GR0_TLBIALLNSNH);
 
 	reg = readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
 



Can a quirk be used to work around the issue?
Or can we just "pessimize" the 3 writes for everybody?
(Might be cheaper than a test anyway)

Regards.