Re: [RFC PATCH v1 00/57] Boot-time page size selection for arm64

Ryan Roberts <ryan.roberts@xxxxxxx> · Thu, 24 Oct 2024 11:34:18 +0100

On 22/10/2024 18:30, Neal Gompa wrote:

[...]

>>>>>>>>>>
>>>>>>>>>> This is a generally very exciting patch set! I'm looking forward to seeing it
>>>>>>>>>> land so I can take advantage of it for Fedora ARM and Fedora Asahi Remix.
>>>>>>>>>>
>>>>>>>>>> That said, I have a couple of questions:
>>>>>>>>>>
>>>>>>>>>> * Going forward, how would we handle drivers/modules that require a particular
>>>>>>>>>> page size? For example, the Apple Silicon IOMMU driver code requires the
>>>>>>>>>> kernel to operate in 16k page size mode, and it would need to be disabled in
>>>>>>>>>> other page sizes.
>>>>>>>>>
>>>>>>>>> I think these drivers would want to check PAGE_SIZE at probe time and fail if an
>>>>>>>>> unsupported page size is in use. Do you see any issue with that?
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> * How would we handle an invalid selection at boot?
>>>>>>>>>
>>>>>>>>> What do you mean by invalid here? The current policy validates that the
>>>>>>>>> requested page size is supported by the HW by checking mmfr0. If no page size is
>>>>>>>>> passed on the command line, or the passed value is not supported by the HW, then
>>>>>>>>> the we default to the largest page size supported by the HW (so for Apple
>>>>>>>>> Silicon that would be 16k since the HW doesn't support 64k). Although I think it
>>>>>>>>> may be better to change that policy to use the smallest page size in this case;
>>>>>>>>> 4k is the safer bet for compat and will waste much less memory than 64k.
>>>>>>>>>
>>>>>>>>>> Can we program in a
>>>>>>>>>> fallback when the "wrong" mode is selected for a chip or something similar?
>>>>>>>>>
>>>>>>>>> Do you mean effectively add a machanism to force 16k if the detected HW is Apple
>>>>>>>>> Silicon? The trouble is that we need to select the page size, very early in
>>>>>>>>> boot, before start_kernel() is called, so we really only have generic arch code
>>>>>>>>> and the command line with which to make the decision.
>>>>>>>>
>>>>>>>> Yes... I think a build-time CONFIG for default page size, which can be
>>>>>>>> overridden by a karg makes sense... Even on platforms like Apple
>>>>>>>> Silicon you may want to test very specific things in 4k by overriding
>>>>>>>> with a karg.
>>>>>>>
>>>>>>> Ahh, yes, that would certainly work. I'll work it into the next version.
>>>>>>>
>>>>>>
>>>>>> Could we maybe extend to have some kind of way to include a table of
>>>>>> SoC IDs that certain modes are disabled (e.g. 64k on Apple Silicon)
>>>>>
>>>>> 64k is already disabled on Apple Silicon because mmfr0 reports that 64k is not
>>>>> supported.
>>>>>
>>>>>> and preferred modes when no arg is set (16k for Apple Silicon)? That
>>>>>
>>>>> And it's not obvious that we should hard-code a page size preference to a SoC
>>>>> ID. If the CPU can support multiple page sizes, it should be up to the SW stack
>>>>> to decide, not the SoC.
>>>>>
>>>>> I'm guessing your desire is to have a single kernel build that will boot 16k by
>>>>> default on Apple Silicon and 4k by default on other systems, all without needing
>>>>> to modify the command line? Personally I think it's cleaner to just require
>>>>> setting the page size on the command line in these cases.
>>>>>
>>>>>> way it'd work something like this:
>>>>>>
>>>>>> 1. Table identification of 4/16/64 depending on identified SoC
>>>>> So I'd prefer not to have this
>>>>>
>>>>>> 2. Unidentified ones follow build-time default
>>>>>> 3. karg forces a mode regardless
>>>>> But keep these 2.
>>>>>
>>>>
>>> Since we are talking about Apple Silicon and page size, I would like to
>>> add that on the Apple Silicon SoCs I am working on, the situation is like
>>> this:
>>>
>>> Apple A7 (s5l8960x), A8 (T7000), A8X (T7001): CPU MMU support 4K and 64K
>>> page sizes.
>>>
>>> Apple A9 (s8000/s8003), A9X (s8001), A10 (t8010), A10X (t8011), A11 (t8015):
>>> CPU MMU Support 16K and 64K page sizes.
>>>
>>> However, all of them have 4K page DART IOMMUs.
>>>
>>>> I think it makes sense to have it, because it's not just Apple Silicon
>>>> where such a preference/requirement may be necessary. Apple Silicon
>>>> technically works at 4k, but is completely broken at 4k because Linux
>>>> cannot do 16k IOMMU with 4k everything else, so being able to at least
>>>> prefer 16k out of the box is important. And SoCs like the NVIDIA Grace
>>>> Hopper platform prefer 64k over other options (though I am unaware of
>>>> a gross incompatibility that effectively requires it like Apple
>>>> Silicon has).
>>>>
>>>> When we're trying to get to "single generic image that works
>>>> everywhere", stuff like this matters and I would really like you to
>>>> consider it from the lens of "we want things to work as automagic as
>>>> they do on x86".
>>> For me, in order to get to this level of automagic, there do need to be
>>> a table of which SoC should use which page size table.
>>
>> OK, but it's not clear to me that this table needs to be in the kernel. Could it
>> not be something in user space (e.g. during installation) that configures the
>> kernel command line?
>>
> 
> This is not compatible with using things like ISOs with UEFI+ACPI
> enabled desktop/server systems. We need to be able to safely,
> automatically, and correctly boot up and support hardware. The only
> place to do that early enough is in the kernel. But this can wait
> until the core stuff is in.

OK got it.

> 
>> Regardless, the hard work here is getting the boot-time page size selection
>> mechanism in place. Once that's there, follow up patches can add the desired
>> policy. I'd rather leave it out for now to avoid anything slowing down the core
>> work.
>>
> 
> Sure, this can be done afterward.

Thanks! I understand the problem a bit better now. I'm sure we can find a
solution once we have landed the core mechanism.

Thanks,
Ryan