Any takers on trashing my regex?
Otherwise I'll just submit a v2 with the regex and it can be
shat on there instead :)

On 09/08/2022 19:36, Conor Dooley wrote:
> On 09/08/2022 15:14, Rob Herring wrote:
>> On Mon, Aug 08, 2022 at 10:01:11PM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
>>> On 08/08/2022 22:34, Jessica Clarke wrote:
>>>> On Fri, Aug 05, 2022 at 05:28:42PM +0100, Conor Dooley wrote:
>>>>> From: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
>>>>> The final patch adds some new ISA strings
>>>>> which needs scrutiny from someone with more knowledge about what ISA
>>>>> extension strings should be reported in a dt than I have.
>>>>
>>>> Listing every possible ISA string supported by the Linux kernel really
>>>> is not going to scale...
>>
>> How does the kernel scale? (No need to answer)
>>
>>> Yeah, totally correct there. Case for adding a regex I suppose, but I
>>> am not sure how to go about handling the multi-letter extensions or
>>> if parsing them is required from a binding compliance point of view.
>>> Hoping for some input from Palmer really.
>>
>> Yeah, looks like a regex pattern is needed.
>
> I started pottering away at this but I have arrived at:
> rv64imaf?d?c?h?(_z[imafdqcbvkh]([a-z])*)*$
>
> I suspect that before "h?" there should be more single letter
> extensions added for completeness' sake. So then it'd bloat out to:
> rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$
>
> I checked a couple different "bad" isa strings against it and
> nothing went up in flames but my regex skills are far from great
> so I'm sure there's better ways to represent this.
>
> Anyways, this pattern is based on my understanding that:
> - the single letter order is fixed & we don't care about things that
>   can't even do "ima"
> - the multi letter extensions are all in a "_z<foo>" format where the
>   first letter of <foo> is a valid single letter extension
> - we don't care about the e extension from an OS PoV (this could be a
>   very flawed take...)
> - after the first two chars, the extension name could be an English
>   word (ifencei anyone?) so it's not worth restricting the charset
> - that attempting to validate the contents of the multiletter extensions
>   with dt-validate beyond the formatting is a futile, massively verbose
>   or unwieldy exercise at best
>
> Some or all of those assumptions could be very very wrong so if {someone,
> anyone} wants to correct me - feel ***more*** than free..
>
> Thanks,
> Conor.
>
> patch would then look like:
>
> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> index d632ac76532e..1e54e7746190 100644
> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> @@ -74,9 +74,7 @@ properties:
>        insensitive, letters in the riscv,isa string must be all
>        lowercase to simplify parsing.
>      $ref: "/schemas/types.yaml#/definitions/string"
> -    enum:
> -      - rv64imac
> -      - rv64imafdc
> +    pattern: rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$
>  
>    # RISC-V requires 'timebase-frequency' in /cpus, so disallow it here
>    timebase-frequency: false
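
For anyone who wants to poke at the longer pattern outside of dt-validate, here is a minimal Python sketch. It is only an approximation: the pattern is copied verbatim from the proposed diff, but the `matches` helper and the sample strings are made up for illustration, and it assumes the schema "pattern" keyword is applied as an unanchored search (which is how JSON Schema defines it), hence `re.search` rather than `re.fullmatch`.

```python
import re

# Pattern copied verbatim from the proposed cpus.yaml change.
PATTERN = r"rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$"


def matches(isa: str) -> bool:
    """Hypothetical helper: True if `isa` satisfies the pattern.

    Assumption: the validator searches for the pattern anywhere in the
    string instead of requiring a full match, so re.search is used.
    """
    return re.search(PATTERN, isa) is not None


# Illustrative inputs only, not an exhaustive list of valid ISA strings.
SAMPLES = [
    "rv64imac",                   # previously in the enum
    "rv64imafdc",                 # previously in the enum
    "rv64imafdc_zicsr_zifencei",  # multi-letter extensions in "_z<foo>" form
    "rv64imafdch_zba_zbb",
    "rv64imacf",                  # single letters out of the fixed order
    "rv64ima_sscofpmf",           # multi-letter extension not starting "_z"
    "rv32imac",                   # the pattern only covers rv64
    "xrv64imac",                  # junk prefix
]

for isa in SAMPLES:
    print(f"{isa:28} {'ok' if matches(isa) else 'rejected'}")
```

One thing this shows: because the pattern ends in "$" but has no leading "^", a string with junk in front of "rv64" is still accepted under search semantics. Prefixing the pattern with "^" would reject it.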