On 2024-09-25 at 18:45:54, Sean Allred wrote:
> "brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes:
>
> > On 2024-09-24 at 21:52:35, Ron Ziroby Romero wrote:
> >> What do y'all think?
> >
> > I think this is ultimately a bad idea. JSON requires that the output be
> > UTF-8, but Git processes a large amount of data, including file names,
> > ref names, commit messages, author and committer identities, diff
> > output, and other file contents, that are not restricted to UTF-8.
>
> This strikes me with a little bit of 'perfect as the enemy of good'
> here. I'm sure there are ways to signal an encoding failure. I would,
> however, caution against trying to provide diff output in JSON. That
> just seems... odd. Maybe base64 it first? (I don't know -- I just
> struggle to see the use-case here.)

I understand JSON output would be useful, but it's also not useful to
randomly fail to do git for-each-ref (for example) because someone has a
non-UTF-8 ref, or to fail to do a git log because of encoding problems
(which absolutely is a problem in the Linux kernel tree).

"It works most of the time, but seemingly randomly fails" is not a good
user experience, and I'm opposed to adding serialization formats that do
that. (For that reason, just-send-bytes that produces invalid JSON on
occasion is also unacceptable.)

If we always base64-encoded or percent-encoded the things that aren't
guaranteed to be UTF-8, then we could well create JSON. However, that
makes working with the data structure in most scripting languages a pain
since there's no automatic decoding of this data. In strongly typed
languages like Rust, it's possible to do this decoding with no problem,
but I expect that's not most users who'd want this feature.

-- 
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA
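
The base64 trade-off described above can be sketched in Python. This is only an illustration, not a proposed Git format: the ref name, the `refname` key, and the helpers `encode_field` and `decode_field` are all hypothetical. It shows that a non-UTF-8 byte string cannot be turned into a JSON string directly, and that base64 makes the JSON valid at the cost of an explicit decode step on the consumer side.

```python
import base64
import json

# Hypothetical ref name ending in a Latin-1 byte (0xE9), which is not valid UTF-8.
raw_ref = b"refs/heads/caf\xe9"


def encode_field(value: bytes) -> str:
    """Base64-encode raw bytes so they can always be stored in a JSON string."""
    return base64.b64encode(value).decode("ascii")


def decode_field(value: str) -> bytes:
    """Reverse encode_field; every consumer has to remember to call this."""
    return base64.b64decode(value)


# Treating the bytes as UTF-8 fails, so they cannot become a JSON string as-is.
try:
    raw_ref.decode("utf-8")
    direct_ok = True
except UnicodeDecodeError:
    direct_ok = False

# With base64 the document is always valid JSON...
record = json.dumps({"refname": encode_field(raw_ref)})

# ...but json.loads() hands back the encoded text, not the ref name itself.
parsed = json.loads(record)
round_tripped = decode_field(parsed["refname"])
```

After `json.loads()`, `parsed["refname"]` is still base64 text, so a script that wants the actual ref name has to know which fields are encoded and decode each one by hand. That is the "no automatic decoding" cost mentioned above.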