Re: [PATCH v1] dt-bindings: riscv: deprecate riscv,isa

Palmer Dabbelt <palmer@xxxxxxxxxxx> · Thu, 18 May 2023 07:41:17 -0700 (PDT)

On Thu, 18 May 2023 07:06:17 PDT (-0700), Conor Dooley wrote:
On Thu, May 18, 2023 at 07:13:15PM +0530, Anup Patel wrote:
On Thu, May 18, 2023 at 4:02 PM Andrew Jones <ajones@xxxxxxxxxxxxxxxx> wrote:
> On Thu, May 18, 2023 at 09:58:30AM +0100, Conor Dooley wrote:

> > -  riscv,isa:
> > -    description:
> > -      Identifies the specific RISC-V instruction set architecture
> > -      supported by the hart.  These are documented in the RISC-V
> > -      User-Level ISA document, available from
> > -      https://riscv.org/specifications/
> > -
> > -      Due to revisions of the ISA specification, some deviations
> > -      have arisen over time.
> > -      Notably, riscv,isa was defined prior to the creation of the
> > -      Zicsr and Zifencei extensions and thus "i" implies
> > -      "zicsr_zifencei".
> > -
> > -      While the isa strings in ISA specification are case
> > -      insensitive, letters in the riscv,isa string must be all
> > -      lowercase to simplify parsing.
> > -    $ref: "/schemas/types.yaml#/definitions/string"
> > -    pattern: ^rv(?:64|32)imaf?d?q?c?b?k?j?p?v?h?(?:[hsxz](?:[a-z])+)?(?:_[hsxz](?:[a-z])+)*$
> > -
> >    # RISC-V requires 'timebase-frequency' in /cpus, so disallow it here
> >    timebase-frequency: false
> >
> > @@ -133,8 +117,13 @@ properties:
> >        DMIPS/MHz, relative to highest capacity-dmips-mhz
> >        in the system.
> >
> > +oneOf:
> > +  - required:
> > +      - riscv,isa
>
> This is the part Anup keeps reminding me about. We can create better ways
> to handle extensions in DT and ACPI, but we'll still need to parse ISA
> strings to handle legacy DTs and holdouts that keep creating ISA strings,
> at least during the deprecation period, since ISA strings are still "the
> way to do it" according to the spec.

Coming up with an alternate way in DT is fine but we can't deprecate
ISA strings since ISA strings are widely used:
1) Various bootloaders

Aye, for the reason, as I mentioned earlier and in the RFC thread,
removing existing parsers isn't a good idea.

Removing and deprecating are different.  We can deprecate stuff.

2) It is part of /proc/cpuinfo

That is irrelevant.

3) Hypervisors use it to communicate HW features to Guest/VM.
Hypervisors can't get away from generating ISA strings because
Hypervisors don't know what is running inside Guest/VM.

Generate both :) As things stand, your guests could interpret what you
communicate to them via riscv,isa differently!

In the case of ACPI, it is a very different situation. Like Sunil mentioned,
ACPI will always follow mechanisms defined by RVI (such as ISA string).
Other ACPI approaches such as GUID for ISA extension are simply not
scalable and will take a lot more memory for ACPI tables compared to
ISA strings.

My proposal should actually suit ACPI, at least for Linux, as it would
be a chance to align currently misaligned definitions. I won't speak to
GUIDs or whatever as that's someone else's problem :)

We talked a bit in the patchwork meeting with Drew about ACPI.  Any 
actual spec/encoding would need to be different, of course, but 
conceptually it seems to fit fine.  It's also broadly similar to what 
we've done with riscv_hwprobe() for userspace, which is nice.

> Also, if we assume the wording in the spec does get shored up, then,
> unless I'm missing something, the list of advantages for this boolean
> proposal from your commit message would be

IMO, we should try our best to have the wordings changed in RVI spec.

Yes, doing so is beneficial for all of us regardless of what happens
here. I do think that it is partially orthogonal - it allows us to not
design an interface that needs to be capable of communicating a wide
variety of versions, but I don't think it solves some of the issues
that riscv,isa has. If I thought it did, I would not have gone to the
trouble of respinning this patch out of the other approach.

> * More character choices for name -- probably not a huge gain for ratified
>   extensions, since the boolean properties will likely still use the same
>   name as the ISA string (riscv,isa-extension-<name>). But, for vendor
>   extensions, this is indeed a major improvement, since vendor extension
>   boolean property names may need to be extended in unambiguous ways to
>   handle changes in the extension.
>
> * Simpler, more complete DT validation (but we still need a best effort
>   for legacy ISA strings)
>
> * Simpler DT parsing (but we still need the current parser for legacy ISA
>   strings)
>
> > +  - required:
> > +      - riscv,isa-base
> > +
> >  required:
> > -  - riscv,isa
> >    - interrupt-controller
> >
> >  additionalProperties: true
> > @@ -177,7 +166,13 @@ examples:
> >                  i-tlb-size = <32>;
> >                  mmu-type = "riscv,sv39";
> >                  reg = <1>;
> > -                riscv,isa = "rv64imafdc";
> > +                riscv,isa-base = "rv64i";
> > +                riscv,isa-extension-i;
> > +                riscv,isa-extension-m;
> > +                riscv,isa-extension-a;
> > +                riscv,isa-extension-f;
> > +                riscv,isa-extension-d;
> > +                riscv,isa-extension-c;

One downside of this new approach is it will increase the size of DTB.
Imaging 50 such DT properties in 46 CPU DT nodes.

I should do a comparison between 50 extensions in riscv,isa and doing
this 50 times and see what the sizes are.

I'm not sure how sensitive people are to DT size (presumably it'd be DTB 
size)?

It's also not clear what we can do about it: RISC-V has lots of 
extensions, that's going to take encoding space.  Sticking with an 
ambiguous encoding because it's smaller seems like a way to get 
burned in the long run.