On Mon, Aug 15, 2022 at 07:18:02PM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
> Any takers on trashing my regex? Otherwise I'll just submit
> a v2 with the regex and it can be shat on there instead :)
>
> On 09/08/2022 19:36, Conor Dooley wrote:
> > On 09/08/2022 15:14, Rob Herring wrote:
> >> On Mon, Aug 08, 2022 at 10:01:11PM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
> >>> On 08/08/2022 22:34, Jessica Clarke wrote:
> >>>> On Fri, Aug 05, 2022 at 05:28:42PM +0100, Conor Dooley wrote:
> >>>>> From: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
> >>>>> The final patch adds some new ISA strings
> >>>>> which needs scruitiny from someone with more knowledge about what ISA
> >>>>> extension strings should be reported in a dt than I have.
> >>>>
> >>>> Listing every possible ISA string supported by the Linux kernel really
> >>>> is not going to scale...
> >>
> >> How does the kernel scale? (No need to answer)
> >>
> >>> Yeah, totally correct there. Case for adding a regex I suppose, but I
> >>> am not sure how to go about handling the multi-letter extensions or
> >>> if parsing them is required from a binding compliance point of view.
> >>> Hoping for some input from Palmer really.
> >>
> >> Yeah, looks like a regex pattern is needed.
> >
> > I started pottering away at this but I have arrived at:
> > rv64imaf?d?c?h?(_z[imafdqcbvkh]([a-z])*)*$

Don't forget the ^ at the start.

Do we need to worry about optional major and minor version numbers?

Or check that Z names have at least one character following the category
character? Actually, the first letter after Z being a category is only a
convention. Maybe we don't want to enforce that.

What about X extensions?

Thanks,
drew

> >
> > I suspect that before "h?" there should be more single letter
> > extensions added for completeness sake. So then it'd bloat out to:
> > rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$
> >
> > I checked a couple different "bad" isa strings against it and
> > nothing went up in flames but my regex skills are far from great
> > so I'm sure there's better ways to represent this.
> >
> > Anyways, this pattern is based on my understanding that:
> > - the single letter order is fixed & we don't care about things that
> > can't even do "ima"
> > - the multi letter extensions are all in a "_z<foo>" format where the
> > first letter of <foo> is a valid single letter extension
> > - we don't care about the e extension from an OS PoV (this could be a
> > very flawed take...)
> > - after the first two chars, the extension name could be an english
> > word (ifencei anyone?) so it's not worth restricting the charset
> > - that attempting to validate the contents of the multiletter extensions
> > with dt-validate beyond the formatting is a futile, massively verbose
> > or unwieldy exercise at best
> >
> > Some or all of those assumptions could be very very wrong so if {someone,
> > anyone} wants to correct me - feel ***more*** than free..
> >
> > Thanks,
> > Conor.
> >
> > patch would then look like:
> >
> > diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > index d632ac76532e..1e54e7746190 100644
> > --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > @@ -74,9 +74,7 @@ properties:
> > insensitive, letters in the riscv,isa string must be all
> > lowercase to simplify parsing.
> > $ref: "/schemas/types.yaml#/definitions/string"
> > - enum:
> > - - rv64imac
> > - - rv64imafdc
> > + pattern: rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$
> >
> > # RISC-V requires 'timebase-frequency' in /cpus, so disallow it here
> > timebase-frequency: false
>
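
For poking at the questions above outside of dt-validate, here is a
minimal Python sketch. It assumes the schema "pattern" keyword behaves
like an unanchored search (JSON Schema does not anchor patterns
implicitly), which is why it uses re.search. PROPOSED is copied from the
quoted patch; ANCHORED only prepends the ^ and is not meant as a
suggested final form. The helper name and the sample strings, including
the "xfoo" extension name, are made up for illustration.

```python
import re

# Pattern exactly as it appears in the quoted patch (no leading ^).
PROPOSED = r"rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$"
# Same pattern with only the leading ^ added; nothing else is changed.
ANCHORED = "^" + PROPOSED


def accepts(pattern: str, isa: str) -> bool:
    """Return True if `pattern` is found anywhere in `isa`.

    Assumption: this mirrors an unanchored schema "pattern" check,
    so re.search is used rather than re.fullmatch.
    """
    return re.search(pattern, isa) is not None


# Illustrative inputs only; none of these come from a real devicetree.
SAMPLES = [
    "rv64imac",                   # entry from the old enum
    "rv64imafdc",                 # entry from the old enum
    "rv64imafdc_zicsr_zifencei",  # multi-letter Z extensions
    "xrv64imac",                  # junk before the base ISA
    "rv64imac_zi",                # Z name with only the category letter
    "rv64imafdc_xfoo",            # X extension (hypothetical name)
    "rv64i2p1ma",                 # version number after a letter
    "rv32imac",                   # 32-bit base
]

if __name__ == "__main__":
    for isa in SAMPLES:
        print(f"{isa:28} proposed={accepts(PROPOSED, isa)!s:5} "
              f"anchored={accepts(ANCHORED, isa)}")
```

With these inputs, the junk-prefixed string is accepted by the pattern as
posted and rejected once the ^ is added, "rv64imac_zi" is accepted either
way, and the version-numbered, X-extension and rv32 strings are rejected
by both.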