Eric Blake <eblake@xxxxxxxxxx> writes:

> On 5/7/19 4:39 AM, Daniel P. Berrangé wrote:
>
>>> JSON is terrible at interoperability, so good luck with that.
>>>
>>> If you reduce your order to "the commonly used JSON libraries we know",
>>> we can talk.
>>
>> I don't particularly want us to rely on semantics of a small known set
>> of JSON libs. I really do want us to do something that is capable of
>> working with any JSON impl that exists in any programming language.
>>
>> My suggested option 2 & 3 at least would manage that I believe, as
>> any credible JSON impl will be able to represent 32-bit integers
>> or strings without losing data.
>>
>> Option 1 would not cope as some impls can't even cope with
>> signed 64-bit ints.
>>
>>>>>> I can think of some options:
>>>>>>
>>>>>> 1. Encode unsigned 64-bit integers as signed 64-bit integers.
>>>>>>
>>>>>>    This follows the example that most C libraries map JSON ints
>>>>>>    to 'long long int'. This is still relying on undefined
>>>>>>    behaviour as apps don't need to support > 2^53-1.
>>>>>>
>>>>>>    Apps would need to cast back to 'unsigned long long' for
>>>>>>    those QMP fields they know are supposed to be unsigned.
>>>
>>> Ugly. It's also what we did until v2.10, August 2017. QMP's input
>>> direction still does it, for backward compatibility.
>
> Having qemu accept signed ints in place of large unsigned values is easy
> enough. But you are right that it loses precision when doubles are
> involved on the receiving end, and we cross the 2^53 barrier.
>
>>>
>>>>>>
>>>>>>
>>>>>> 2. Encode all 64-bit integers as a pair of 32-bit integers.
>>>>>>
>>>>>>    This is fully compliant with the JSON spec as each half
>>>>>>    is fully within the declared limits. App has to split or
>>>>>>    assemble the 2 pieces from/to a signed/unsigned 64-bit
>>>>>>    int as needed.
>>>
>>> Differently ugly.
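To make option 1 and the 2^53 barrier concrete, here is a minimal Python sketch. It assumes a receiving JSON library that keeps every number in an IEEE 754 double; the helper names (u64_to_wire, wire_to_u64, through_double) are made up for illustration and are not QEMU or libvirt code.

```python
import struct

U64_MASK = 0xFFFFFFFFFFFFFFFF


def u64_to_wire(value):
    """Option 1, sender side: reinterpret an unsigned 64-bit value as signed."""
    return struct.unpack("<q", struct.pack("<Q", value))[0]


def wire_to_u64(value):
    """Option 1, client side: cast the signed wire value back to unsigned."""
    return value & U64_MASK


def through_double(value):
    """Model a JSON library that stores every number as a double (assumption)."""
    return int(float(value))


# The largest uint64 travels as -1 and casts back cleanly.
print(u64_to_wire(2**64 - 1))                # -1
print(wire_to_u64(u64_to_wire(2**64 - 1)))   # 18446744073709551615

# Past 2^53 a double can no longer hold every integer.
print(through_double(2**53 + 1) == 2**53 + 1)   # False

# The same loss corrupts an option 1 value on a double-based client.
original = 2**63 + 1
received = wire_to_u64(through_double(u64_to_wire(original)))
print(received == original)                  # False
```

The round trip is exact only while the client keeps full 64-bit integers; once a double sits in the path, any value whose magnitude exceeds 2^53 may come back changed.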
>
> Particularly ugly as we turn 1<<55 from:
>
>   "value":36028797018963968
>
> into
>
>   "value":[8388608,0]
>
> and now both qemu and the client end have to agree that an array of two
> integers is a valid replacement for any larger 64-bit quantity
> (presumably, we'd always accept the array form even for small integer
> values, but only produce the array form for large values). And while it
> manages just fine for uint64_t values, what rules would you place on
> int64_t values? That the resulting 2-integer array is combined with the
> first number as a 2's-complement signed value, and the second being a
> 32-bit unsigned value?

There's more than one way to encode integers as a list of 53 bit signed
integers. Any of them will do, we just have to specify one.

>>>>>>
>>>>>>
>>>>>> 3. Encode all 64-bit integers as strings
>>>>>>
>>>>>>    The application has to do all parsing/formatting client
>>>>>>    side.
>>>
>>> Yet another ugly.
>
> But less so than option 2.
>
>   "value":36028797018963968
>
> vs.
>
>   "value":"36028797018963968"
>
> is at least tolerable.

Yes.

>>>>>> None of these changes are backwards compatible, so I doubt we could make
>>>>>> the change transparently in QMP. Instead we would have to have a
>>>>>> QMP greeting message capability where the client can request enablement
>>>>>> of the enhanced integer handling.
>>>
>>> We might be able to do option 1 without capability negotiation. v2.10's
>>> change from option 1 to what we have now produced zero complaints.
>>>
>>> On the other hand, we made that change for a reason, so we may want a
>>> "send large integers as negative integers" capability regardless.
>>>
>>>>>>
>>>>>> Any of the three options above would likely work for libvirt, but I
>>>>>> would have a slight preference for either 2 or 3, so that we become
>>>>>> 100% standards compliant.
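For comparison, options 2 and 3 applied to the 1<<55 example can be sketched in Python as follows. The [high, low] ordering and the "signed high half, unsigned low half" rule are just one of the possible conventions mentioned above, picked here as an assumption; the helper names are hypothetical.

```python
import json

MASK32 = 0xFFFFFFFF


def int_to_pair(value):
    """Option 2: split into [high, low]; low is always an unsigned 32-bit half."""
    return [value >> 32, value & MASK32]


def pair_to_int(pair):
    """Option 2: reassemble; a negative high half yields a negative result."""
    high, low = pair
    return (high << 32) | low


def int_to_string(value):
    """Option 3: plain decimal string, parsed again by the application."""
    return str(value)


value = 1 << 55
print(json.dumps({"value": int_to_pair(value)}, separators=(",", ":")))
# {"value":[8388608,0]}
print(json.dumps({"value": int_to_string(value)}, separators=(",", ":")))
# {"value":"36028797018963968"}

# Under the assumed convention, int64 values round-trip as well.
print(int_to_pair(-1))                # [-1, 4294967295]
print(pair_to_int(int_to_pair(-1)))   # -1
```

Either form keeps every number on the wire well below 2^53, so a double-based JSON library passes it through unchanged; the cost is that both ends must agree on the convention.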
>
> If we're going to negotiate something, I'd lean towards option 3
> (anywhere the introspection states that we accept 'int64' or similar, it
> is also appropriate to send a string value in its place). We'd also have
> to decide if we want to allow "0xabcd", or strictly insist on 43981,
> when stringizing an integer. And while qemu should accept a string or a
> number on input, we'd still have to decide/document whether its
> response to the client capability negotiation is to output a string
> always, or only for values larger than the 2^53 threshold.

Picking option 3 is no excuse for complicating matters further. QMP is
primarily for machines. So my first choice would be to keep everything
decimal. I could be persuaded to have QEMU parse integers from strings
with base 0, i.e. leading 0x gets you hex, leading 0 gets you octal.

>>>
>>> There's no such thing. You mean "we maximize interoperability with
>>> common implementations of JSON".
>>
>> s/common/any/
>>
>>> Let's talk implementation for a bit.
>>>
>>> Encoding and decoding integers in funny ways should be fairly easy in
>>> the QObject visitors. The generated QMP marshallers all use them.
>>> Trouble is a few commands still bypass the generated marshallers, and
>>> mess with the QObject themselves:
>>>
>>> * query-qmp-schema: minor hack explained in qmp_query_qmp_schema()'s
>>>   comment. Should be harmless.
>>>
>>> * netdev_add: not QAPIfied. Eric's patches to QAPIfy it got stuck
>>>   because they reject some abuses like passing numbers and bools as
>>>   strings.
>>>
>>> * device_add: not QAPIfied. We're not sure QAPIfication is feasible.
>>>
>>> netdev_add and device_add both use qemu_opts_from_qdict(). Perhaps we
>>> could hack that to mirror what the QObject visitors do.
>>>
>>> Else, we might have to do it in the JSON parser. Should be possible,
>>> but I'd rather not.
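The input rule floated above (accept a JSON number or a string, decimal by default, optionally base 0) could look roughly like this in Python. This is only a sketch of the semantics, not how the QObject visitors do or would do it; parse_u64 and its base0 flag are hypothetical names.

```python
import re

U64_MAX = 2**64 - 1
_DECIMAL = re.compile(r"[0-9]+")
_HEX = re.compile(r"0[xX][0-9a-fA-F]+")


def parse_u64(value, base0=False):
    """Accept a JSON number or a string and return an unsigned 64-bit value.

    Strings are decimal only unless base0 is set, in which case a leading
    "0x" selects hex and a leading "0" selects octal (strtoull-style).
    """
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise TypeError("expected a JSON number or string")
    if isinstance(value, int):
        number = value
    elif base0 and _HEX.fullmatch(value):
        number = int(value, 16)
    elif _DECIMAL.fullmatch(value):
        octal = base0 and len(value) > 1 and value[0] == "0"
        number = int(value, 8 if octal else 10)
    else:
        raise ValueError(f"not a valid integer string: {value!r}")
    if not 0 <= number <= U64_MAX:
        raise ValueError(f"out of range for uint64: {number}")
    return number


print(parse_u64(43981))                   # 43981
print(parse_u64("43981"))                 # 43981
print(parse_u64("0xabcd", base0=True))    # 43981
print(parse_u64("010", base0=True))       # 8
print(parse_u64("010"))                   # 10
```

The last two lines show why base 0 is a real decision: the same string means two different numbers depending on the mode, which is one argument for keeping everything decimal.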
>>>
>>>>> My preference would be 3 with the strings defined as being
>>>>> %x lower case hex formatted with a 0x prefix and no longer than 18 characters
>>>>> ("0x" + 16 nybbles). Zero padding allowed but not required.
>>>>> It's readable and unambiguous when dealing with addresses; I don't want
>>>>> to have to start decoding (2) by hand when debugging.
>>>>
>>>> Yep, that's a good point about readability.
>>>
>>> QMP sending all integers in decimal is inconvenient for some values,
>>> such as addresses. QMP sending all (large) integers in hexadecimal
>>> would be inconvenient for other values.
>>>
>>> Let's keep it simple & stupid. If you want sophistication, JSON is the
>>> wrong choice.
>
> JSON requires decimal-only, but I'm okay if we state that when
> negotiating the alternative representation, that we output hex-only.
> (JSON5 adds hex support among other things, but it is not an RFC
> standard, and even fewer libraries exist that parse JSON5 in addition to
> straight JSON).
>
>>>
>>>
>>> Option 1 feels simplest.
>>
>> But will still fail with any JSON impl that uses double precision floating
>> point for integers as it will lose precision.
>>
>>> Option 2 feels ugliest. Less simple, more interoperable than option 1.
>>
>> If we assume any JSON impl can do 32-bit integers without loss of
>> precision, then I think we can say it is guaranteed portable, but
>> it is certainly horrible / ugly.
>>
>>> Option 3 is like option 2, just not quite as ugly.
>>
>> I think option 3 can be guaranteed to be loss-less with /any/ JSON impl
>> that exists, since you're delegating all string -> int conversion to
>> the application code taking the JSON parser/formatter out of the equation.
>>
>> This is close to the approach libvirt takes with YAJL parser today. YAJL
>> parses as an int64 and we then ignore its result, and re-parse the string
>> again in libvirt as uint64. When generating json we format as uint64
>> in libvirt and ignore YAJL's formatting for int64.
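For reference, the hex string variant of option 3 proposed above ("0x" plus at most 16 lower-case nybbles, zero padding allowed but not required) is easy to pin down. A minimal Python sketch, with hypothetical helper names:

```python
import re

U64_MAX = 2**64 - 1
# "0x" followed by 1 to 16 lower-case hex digits, so at most 18 characters.
_HEX_U64 = re.compile(r"0x[0-9a-f]{1,16}")


def format_hex_u64(value):
    """Produce the unpadded lower-case form, e.g. 255 -> "0xff"."""
    if not 0 <= value <= U64_MAX:
        raise ValueError(f"out of range for uint64: {value}")
    return format(value, "#x")


def parse_hex_u64(text):
    """Accept padded or unpadded input that fits the 18 character limit."""
    if not _HEX_U64.fullmatch(text):
        raise ValueError(f"not a valid hex uint64 string: {text!r}")
    return int(text, 16)


print(format_hex_u64(1 << 55))               # 0x80000000000000
print(format_hex_u64(U64_MAX))               # 0xffffffffffffffff
print(parse_hex_u64("0x0080000000000000"))   # 36028797018963968
```

Because the JSON library only ever sees a string, the conversion stays entirely in application code, much like the libvirt approach of re-parsing the original text as uint64.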
>>
>>> Can we agree to eliminate option 2 from the race?
>>
>> I'm fine with eliminating option 2.
>
> Same here.

Noted.

>> I guess I'd have a preference for option 3 given that it has better
>> interoperability
>
> Likewise - if we're going to bother with a capability that changes
> output and allows the input validators to accept more forms, I'd prefer
> a string form with correct sign over a negative integer that depends on
> 64-bit 2's-complement arithmetic to interpret correctly.

Noted.

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list