Brad King <brad.king@xxxxxxxxxxx> writes:

> Nothing else uses LF NUL.  I chose it as a starting point for
> this very discussion, which I asked about in $gmane/233653.

The primary reason why LF raised my eyebrow was because the reason
why many subcommands use "-z" (and NUL) is often because the payload
may have LF in a record and LF cannot be used as a record separator
without escaping.  And they use NUL knowing that the payload data in
fields cannot contain a NUL.

If we used LF as a signal to define the structure of the record, it
pretty much defeats the whole point of defining "-z" format.  The -m
reason string will be made into a single liner deep in the codepath
but it _can_ contain LF.

I would have been more receptive to, say, double-NUL as a record
terminator, while using a NUL as a field terminator, or something,
but then we would need to have a way to express an empty field.

> In this particular use case we know the last field will never
> be LF but that may not be so for future cases.  There is no way
> to represent sentinel-terminated arbitrary variable-width records
> of NUL-terminated fields without some kind of escaping for the
> sentinel value, but the whole point of -z is to avoid escaping.

Indeed, but one escape hatch we have is that payload will not
contain NUL anywhere, so whenever we see a NUL, we can trust that it
defines the structure of the record, and is not a part of the
payload.

Stepping back a bit, here are some observations on the arguments
update-ref can take:

 * "-m <reason>" is a reason given for this entire update.  As the
   point of this new feature is to give an all-or-none update to one
   or more refs, I think we should not have to accept more than one
   reason (more specifically, the -m option does _not_ belong to a
   specific record that describes what happens to a single ref).

 * "-d <ref> <oldvalue>" is a way to delete a ref.  <oldvalue> may
   be missing.

 * "--no-deref <ref> <newvalue> <oldvalue>" and "<ref> <newvalue>
   <oldvalue>" are ways to update or create a ref.  Again <oldvalue>
   may be missing.

So it looks to me that one possible format that is easy to generate
by machine without ambiguity may be:

 * The first record could be

        m NUL <reason string> NUL

   but it is optional.  The reason string may contain LF but just
   like invocation from the command line, LF will eventually be
   cleaned up into a SP.

 * Then a series of records of different kinds follow.

   - A delete record looks like this:

        d NUL <ref> NUL <oldvalue> NUL

     If you want to delete the ref without "oldvalue" protection,
     just say

        d NUL <ref> NUL NUL

   - A create/update record looks like one of these:

        u NUL <ref> NUL <newvalue> NUL <oldvalue> NUL
        n NUL <ref> NUL <newvalue> NUL <oldvalue> NUL

     Again, if you want to create or update the ref without
     "oldvalue" protection, just say

        u NUL <ref> NUL <newvalue> NUL NUL
        n NUL <ref> NUL <newvalue> NUL NUL

 * EOF signals the end of the request.

I am not saying the above is the best format, but the point is that
the mode of the operation defines the structure, so unlike parsing
xml or json where you first parse the structure and then interpret
what each element means, you can define a simple format where the
kind of element comes upfront to let the parser/interpreter know
what is expected to follow it.
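
To illustrate the "kind comes upfront" point, here is a rough Python
sketch of a writer and a reader for the format outlined above.  It is
not what update-ref does or will do; the names FIELDS_AFTER_KIND,
encode_record and parse_records are made up for this example, and
reading "n" as the "--no-deref" variant of "u" is my interpretation of
the record list.  The only thing the reader relies on is that NUL
never appears in the payload, so a one-entry-per-kind table is enough
to know how many fields to consume next.

```python
from typing import Iterator, List, Tuple

# Hypothetical table: how many NUL-terminated fields follow each kind.
FIELDS_AFTER_KIND = {
    b"m": 1,  # m NUL <reason> NUL
    b"d": 2,  # d NUL <ref> NUL <oldvalue> NUL
    b"u": 3,  # u NUL <ref> NUL <newvalue> NUL <oldvalue> NUL
    b"n": 3,  # same as "u", assumed to stand for --no-deref
}


def encode_record(kind: bytes, *fields: bytes) -> bytes:
    """Serialize one record; an empty field is written as a bare NUL."""
    if kind not in FIELDS_AFTER_KIND:
        raise ValueError(f"unknown record kind: {kind!r}")
    if len(fields) != FIELDS_AFTER_KIND[kind]:
        raise ValueError(f"wrong number of fields for kind {kind!r}")
    for field in fields:
        if b"\0" in field:
            raise ValueError("payload must not contain NUL")
    return b"".join(part + b"\0" for part in (kind, *fields))


def parse_records(data: bytes) -> Iterator[Tuple[bytes, List[bytes]]]:
    """Yield (kind, fields) pairs until the end of input (EOF)."""
    if not data:
        return
    if not data.endswith(b"\0"):
        raise ValueError("truncated input: last field is not NUL-terminated")
    # Every NUL is structural, so a plain split recovers all tokens.
    tokens = data[:-1].split(b"\0")
    pos = 0
    while pos < len(tokens):
        kind = tokens[pos]
        if kind not in FIELDS_AFTER_KIND:
            raise ValueError(f"unknown record kind: {kind!r}")
        if kind == b"m" and pos != 0:
            raise ValueError("the reason record must come first")
        count = FIELDS_AFTER_KIND[kind]
        fields = tokens[pos + 1 : pos + 1 + count]
        if len(fields) != count:
            raise ValueError(f"truncated {kind!r} record")
        yield kind, fields
        pos += 1 + count
```

Note that a reason string containing LF goes through both functions
untouched, and that an empty <oldvalue> needs no special marker: it is
just two NULs in a row, which the per-kind field count makes
unambiguous.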