On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > capital letter the parser will treat the remainder of that multi-letter > > > > extension as single-letter extensions. > > > > > > I think the problem is that the parser doesn't completely abort when > > > it sees something it doesn't understand. Continuing is risky since > > > it may be possible to compose an invalid string that gets the parser > > > to run off the rails. > > > > Usually I am of the opinion that we should not seek the validate the dt > > in the kernel, since there are tools for doing so *cough* dt-validate > > *cough*. This one seemed like low hanging fruit though, since the parser > > handles having capital letters in any of the other places after the > > rv##, but falls over pretty badly for this particular issue. > > > > In general, I don't think we need to be concerned about anything that > > fails dt-validate though, you kinda need to trust that that is correct. > > I'd argue that we might even do too much validation in the parser at > > present. > > Is there some attack vector, or ACPI related consideration, that I am > > unaware of that makes this risky? A bit unrelated to this, but your mention of ACPI made me go look at the approved ECR[1] again for the ISA string. It says "Null-terminated ASCII Instruction Set Architecture (ISA) string for this hart. The format of the ISA string is defined in the RISC-V unprivileged specification." I suppose we can still add additional requirements to an ACPI ISA string which the Linux kernel will parse, but it'll be odd to point people at the DT binding to do that. Maybe we should consider making the parser more complete, possibly by importing it from some reference implementation or something. [1] https://drive.google.com/file/d/1nP3nFiH4jkPMp6COOxP6123DCZKR-tia/view Thanks, drew > > C language + string processing == potential attack vector > > > > > > How about completely aborting, noisily, when the string doesn't match > > > expectations, falling back to a default string such as rv64ima instead. > > > That also ought to get faster corrections of device trees. > > > > I did this first actually, but I was afraid that it would cause > > regressions? > > > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > > invalid and dt-validate would have told you so, but at present that > > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > > the hardware in question (since the meaning of i was set before RVI made > > a hames of things). > > > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > > try to keep processing the string. I'm happy to abort entirely on > > reaching a capital if people feel there's unlikely to be a fallout from > > that. > > There might be fallout, but the kernel needs to defend itself. IMO, if > the kernel doesn't know how to parse something, then it should stop > trying to immediately, either with a BUG(), refusing to accept any > part of it, by fallbacking back to a default, or by only accepting what > it believes it parsed correctly. > > The third option is probably a reasonable choice in this case. > > Thanks, > drew