On Wed, Oct 14, 2020 at 05:18:18PM +0100, Catalin Marinas wrote: > Hi Lorenzo, > > On Tue, Oct 13, 2020 at 04:41:00PM +0100, Lorenzo Pieralisi wrote: > > On Tue, Oct 13, 2020 at 03:42:07PM +0200, Ard Biesheuvel wrote: > > > On Tue, 13 Oct 2020 at 15:13, Lorenzo Pieralisi > > > <lorenzo.pieralisi@xxxxxxx> wrote: > > > > On Tue, Oct 13, 2020 at 01:22:32PM +0200, Ard Biesheuvel wrote: > > > > > > > diff --git a/Documentation/arm64/arm-acpi.rst b/Documentation/arm64/arm-acpi.rst > > > > > > > index 47ecb9930dde..947f5b5c45ef 100644 > > > > > > > --- a/Documentation/arm64/arm-acpi.rst > > > > > > > +++ b/Documentation/arm64/arm-acpi.rst > > > > > > > @@ -205,6 +205,13 @@ devices available. This list of tables is not meant to be all inclusive; > > > > > > > in some environments other tables may be needed (e.g., any of the APEI > > > > > > > tables from section 18) to support specific functionality. > > > > > > > > > > > > > > +It is assumed that all DMA capable devices in the system are able to > > > > > > > +access the lowest 4 GB of system memory. If this is not the case, an > > > > > > > +IORT describing those limitations is mandatory, even if an IORT is not > > > > > > > +otherwise necessary to describe the I/O topology, and regardless of > > > > > > > +whether _DMA methods are used to describe the DMA limitations more > > > > > > > +precisely. Once the system has booted, _DMA methods will take precedence > > > > > > > +over DMA addressing limits described in the IORT. > > > > > > > > > > > > If this is a boot requirement it must be in ARM's official documentation, > > > > > > first, not the kernel one. > > > > > > > > > > > > I understand this is an urgent (well - no comments on why bootstrapping > > > > > > ACPI on Raspberry PI4 is causing all this fuss, honestly) fix but that's > > > > > > not a reason to rush through these guidelines. > > > > > > > > > > > > I would not add this paragraph to arm-acpi.rst, yet. > > > > > > > > > > Which documentation? ACPI compliance by itself is not sufficient for a > > > > > system to be able to boot Linux/arm64, which is why we documented the > > > > > requirements for ACPI boot on Linux/arm64 in this file. I don't think > > > > > we need endorsement from ARM to decide that odd platforms like this > > > > > need to abide by some additional rules if they want to boot in ACPI > > > > > mode. > > > > > > > > I think we do - if we don't we should not add this documentation either. > > > > > > > > ACPI on ARM64 software stack is based on standardized HW requirements. > > > > The sheer fact that we need to work around a HW deficiency shows that > > > > either this platform should have never been booted with ACPI or the _HW_ > > > > design guidelines (BSA) are not tight enough. > > > > > > > > Please note that as you may have understood I asked if we can implement > > > > a workaround in IORT because that's information that must be there > > > > regardless (and an OEM ID match in arch code - though pragmatic - > > > > defeats the whole purpose), I don't think we should tell Linux kernel > > > > developers how firmware must be written to work around blatantly > > > > non-compliant systems. > > > > > > This is not about systems being compliant or not, unless there is a > > > requirement somewhere that I missed that all masters in the system > > > must be able to access at least 32 bits of DMA. > > > > I think there is in the SBSA (4.1.3 Memory Map) but regardless, this > > is clearly a design bug, that's not a feature. > > I think in revision D of the SBSA 6.0 this was moved to 3.1.3. Anyway, I > guess you are referring to: > > All Non-secure on-chip masters in a base server system that are > expected to be under the control of the operating system or hypervisor > must be capable of addressing all of the Non-secure address space. > > IIUC, this rules out 32-bit devices as well. Even if we look at the less > strict BSA (https://developer.arm.com/documentation/den0094/latest), the > requirements are the same. > > So we can consider the bug either in the *BSA specs or in hardware. If > at least the 32-bit devices are acceptable, the specs should probably be > updated. Otherwise we just state that Linux has slightly different > requirements than *BSA (more relaxed or tightened in various areas) and > we describe them somewhere under Documentation/arm64/. I agree with you completely and I think that the work should be done at spec level, I don't think that Linux should mandate how and when an IORT table should be provided and with what content, that's BSA+FW spec guidelines - that's my opinion at least. The <=32-bit devices story must be clarified in the BSA. IMO the issue that Raspberry PI4 highlighted (BSA dev <=32 bit addressing + _DMA vs IORT address limits + when an IORT table is mandatory) should be addressed at spec level, not in kernel docs, I don't want to go that way since this is HW+FW guidelines that must apply to all systems, hence my feedback on this patch. > > > The problem here is that Linux/arm64 cannot deal with fully compliant > > > systems that communicate their [permitted] DMA limitations via a _DMA > > > method if this limitation happens to be that the address limit < 32 > > > bits. The DMA subsystem can deal with this fine, only the default DMA > > > zone sizing policy creates an internal issue where the DMA subsystem > > > is not able to allocate memory that matches the DMA constraints. > > > > > > So the 'correct' fix here would be to rework the memory allocator so > > > it can deal with arbitrary DMA limits at allocation time, so that any > > > limit returned by a _DMA method can be adhered to on the fly. > > > > > > However, we all agree that the Raspberry Pi4 is not worth that effort, > > > and that in the general case, SoCs with such limitations, even if they > > > are compliant per the spec, are not worth the trouble of complicating > > > this even more. So as a compromise, I think it is perfectly reasonable > > > to require that systems that have such limitations communicate them > > > via the IORT, which we can parse early, regardless of whether _DMA > > > methods exist as well, and whether they return the same information. > > > > > > So this is not a requirement on arm64 ACPI systems in general. It is a > > > requirement that expresses that we, as arm64 > > > contributors/[co-]maintainers, are willing to cater for such systems > > > if they implement their firmware in a particular way. > > > > I don't think they should implement their firmware in any particular > > way, that's my point, I don't want them to in the first place. > > I haven't checked the *BBR specs, do they say anything about _DMA > methods or IORT tables for devices that can't access the full memory? No and that should be rectified. > > To start with there is no spec I am aware of that defines when/how to > > use _DMA vs IORT address limits, maybe we should spell that out better > > somewhere and that's useful regardless. > > I guess this answers my question above. Yes even if it is not a definitive answer unfortunately. > > My point is: either this workaround works with firmware written with > > guidelines valid for all arm64 systems (not as a special case: add an > > IORT table because we can't parse _DMA to workaround DMA address range > > shenanigans) or I am not willing to merge it - I prefer to add an OEM ID > > quirk and show what we are forced to do to make this work. > > If the guidelines don't say anything about <=32-bit device masks, we can > make up our requirements for Linux. Currently it happens to work on some > kernel versions because we have a ZONE_DMA of 1GB but prior to this > change it wouldn't have worked. We might as well disable ZONE_DMA and > ZONE_DMA32 altogether because the BSA specs state that there are no such > limited devices, but some SoCs would no longer boot. I don't think we can (or we should) do that - that ship has sailed and I am not sure we are in a position to enforce it. Again - that's a BSA discussion to be had. > I think all this depends on whether 32-bit devices are also special and > need quirks. If they do, can we add both 32-bit and <32-bit devices in > the same "special" pot and require IORT tables? I agree it's not nice to > ignore the _DMA method alternative during zone setup but we could still > check the latter at run-time at warn if smaller than the minimum > ZONE_DMA available. I think this whole discussion must be brought up at spec level and rectified. I totally appreciate urgency with this patch and I am happy to merge it without kernel docs (provided it does not trigger any regressions on existing systems - it should not but the only way to know that is by merging it and testing it on ACPI enabled HW). Thanks, Lorenzo