On Thu, Apr 12, 2018 at 08:20:46AM -0600, David Brown wrote:
> On Thu, Apr 12, 2018 at 02:44:20PM +1000, David Gibson wrote:
>
> > Back in the OF days there might have been more restrictions based on
> > special characters in the Forth environment, to prevent paths with
> > aliases being confused for something else.  Not sure.
>
> Not sure how much IEEE 1275 really matters these days, but it specifies
> node names as:
>
>    driver-name@unit-address:device-arguments
>
> with the driver name [a-zA-Z0-9,._+-]+ (the comma being a convention),

Right.  The dtc lexer definition of PROPNODECHAR was based on that
description in §3.2.1.1 of 1275.  And... now that I come to look back
at it, assuming the same set of chars for property and node names
wasn't really correct.  The set was expanded to include a few other
things because they were present in existing device trees of the time,
despite what IEEE 1275 said.

> the address is "bus dependent", and the device arguments being all
> printable characters other than "/", ":", and "@".

Right, but we never use device arguments in flat trees, so they don't
matter.

> The "/" obviously being because it is the path separator.
>
> Alias name is any sequence of printable characters, other than "/",
> "\", ":", "[", "]", and "@".
> Property names do not allow upper-case characters, or "/", "\", ":",
> "[", "]", and "@".

Hrm.  Which is a bit odd, since alias names should also be property
names and obey all the same restrictions they do (no uppercase).

dtc makes the restrictions for node, property and alias names
identical.

> It does specify a specific encoding of 8859-1, which is a bit annoying
> in this Unicode world.  Many bytes of UTF-8 would be considered
> "non-printable" in 8859-1.

Yeah, that's kinda crap.  I think that's an argument for - whatever
else - keeping these to 7-bit ASCII, so we don't have character set
issues.
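
To make the character classes quoted above concrete, here is a minimal
Python sketch. It is not dtc's lexer and not a faithful IEEE 1275
implementation: the helper names (char_class, is_valid) are made up for
illustration, and "printable" is assumed to mean 7-bit ASCII 0x21-0x7E,
following the suggestion above rather than the 8859-1 encoding the
standard specifies.

```python
import re

# Assumption: "printable" is narrowed to 7-bit ASCII 0x21-0x7E (no space),
# per the suggestion in this thread; IEEE 1275 itself specifies 8859-1.
PRINTABLE = "".join(chr(c) for c in range(0x21, 0x7F))


def char_class(excluded, allow_upper=True):
    """Build a regex matching one or more printable chars not in `excluded`."""
    allowed = [
        c for c in PRINTABLE
        if c not in excluded and (allow_upper or not c.isupper())
    ]
    return re.compile("[" + "".join(re.escape(c) for c in allowed) + "]+")


# Driver name, exactly as quoted in the thread.
DRIVER_NAME = re.compile(r"[a-zA-Z0-9,._+-]+")
# Device arguments: printable characters other than "/", ":" and "@".
DEVICE_ARGS = char_class("/:@")
# Alias names: printable characters other than / \ : [ ] @.
ALIAS_NAME = char_class("/\\:[]@")
# Property names: same exclusions as aliases, plus no upper-case letters.
PROPERTY_NAME = char_class("/\\:[]@", allow_upper=False)


def is_valid(pattern, name):
    """Return True if the whole of `name` matches `pattern`."""
    return pattern.fullmatch(name) is not None


if __name__ == "__main__":
    # The oddity noted above: a name can pass the alias rule yet fail
    # the property rule, even though aliases are stored as properties.
    print(is_valid(ALIAS_NAME, "Serial0"))
    print(is_valid(PROPERTY_NAME, "Serial0"))
```

Under these assumptions "Serial0" is accepted as an alias name but
rejected as a property name, which is the inconsistency pointed out
above; using one shared class for all three kinds of name, as dtc does,
avoids it.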

> I think mainly the restricted characters would matter, for parsing
> reasons (although the above suggests that "{" and "}" would be allowed
> in an identifier, which, although allowed by FORTH, is not going to be
> parsed that way by DTC).
>
> FORTH's rules were pretty simple, a word was a string of characters
> separated by a space.  There aren't really any restrictions on the
> names, although names that look like numbers supersede that number, so
> aren't really a good idea.

Ah, ok.

> The DTC lexer being quite different.

It did actually derive from the same place, but yes has diverged a bit
based partly on practicalities and partly on what was actually found
in the wild.

Without a compelling reason, I'm disinclined to widen the set of
allowed characters.  We can always widen, but if we do and it turns
out to be problematic, going back could be very painful.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson