On Wed, Apr 26, 2023 at 02:58:45PM +0100, Conor Dooley wrote: > On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > > capital letter the parser will treat the remainder of that multi-letter > > > > > extension as single-letter extensions. > > > > > > > > I think the problem is that the parser doesn't completely abort when > > > > it sees something it doesn't understand. Continuing is risky since > > > > it may be possible to compose an invalid string that gets the parser > > > > to run off the rails. > > > > > > Usually I am of the opinion that we should not seek the validate the dt > > > in the kernel, since there are tools for doing so *cough* dt-validate > > > *cough*. This one seemed like low hanging fruit though, since the parser > > > handles having capital letters in any of the other places after the > > > rv##, but falls over pretty badly for this particular issue. > > > > > > In general, I don't think we need to be concerned about anything that > > > fails dt-validate though, you kinda need to trust that that is correct. > > > I'd argue that we might even do too much validation in the parser at > > > present. > > > Is there some attack vector, or ACPI related consideration, that I am > > > unaware of that makes this risky? > > > > C language + string processing == potential attack vector > > Right. I thought there was some specific scenario that you had in mind, > rather than the "obvious" "parsing strings is bad". Yup, I only pointed out the obvious since it's always possible, at least for me, to lose sight of the forest for the trees. > What I was wondering is whether the devicetree is an attack vector you > actually have to care about? I had thought it wasn't, hence asking. Nope, I haven't put any thought into this potential attack vector beyond this discussion. > > > > > How about completely aborting, noisily, when the string doesn't match > > > > expectations, falling back to a default string such as rv64ima instead. > > > > That also ought to get faster corrections of device trees. > > > > > > I did this first actually, but I was afraid that it would cause > > > regressions? > > > > > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > > > invalid and dt-validate would have told you so, but at present that > > > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > > > the hardware in question (since the meaning of i was set before RVI made > > > a hames of things). > > After thinking about it a bit cycling home, I don't actually think that > this would be a regression. If your dt is not valid, then that's your > problem, not ours :) > Valid dt will be parsed correctly before and after such a change, so I > think that that is actually okay. > The dt-binding exists for a reason, and can be pointed to if anyone > claims this is a regression I think. I agree. > > > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > > > try to keep processing the string. I'm happy to abort entirely on > > > reaching a capital if people feel there's unlikely to be a fallout from > > > that. > > > > There might be fallout, but the kernel needs to defend itself. IMO, if > > the kernel doesn't know how to parse something, then it should stop > > trying to immediately, either with a BUG(), refusing to accept any > > part of it, by fallbacking back to a default, or by only accepting what > > it believes it parsed correctly. > > > > The third option is probably a reasonable choice in this case. > > My patch could be interpreted as meeting the criteria for option 3. > I think you instead mean "stop parsing at that point & only report the > extensions seen prior to the first bad one"? Right. Thanks, drew