On Tue, Apr 30, 2019 at 03:45:46PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@xxxxxxxxxx) wrote:
> > The QEMU QMP service is based on JSON which is nice because that is a
> > widely supported "standard" data format.....
> >
> > ....except QEMU's implementation (and indeed most impls) are not strictly
> > standards compliant.
> >
> > Specifically the problem is around representing 64-bit integers, whether
> > signed or unsigned.
> >
> > The JSON standard declares that the largest integer is 2^53-1 and
> > likewise the smallest is -(2^53-1):
> >
> >   http://www.ecma-international.org/ecma-262/6.0/index.html#sec-number.max_safe_integer
> >
> > A crazy limit inherited from its javascript origins IIUC.
>
> Ewwww.

Looking a bit deeper it seems this limit comes from the use of
double precision floating point for storing integers. 2^53-1 is the
largest integer value that can be stored in a 64-bit float unambiguously;
above it, distinct integers such as 2^53 and 2^53+1 collapse to the
same value.

The Golang JSON parser decodes JSON numbers to float64 by default so
will have this precision limitation too, though at least they provide
a backdoor for custom parsing from the original serialized representation.

> > QEMU, and indeed many applications, want to handle 64-bit integers.
> > The C JSON library impls have traditionally mapped integers to the
> > data type 'long long int' which gives a min/max of -(2^63) / 2^63-1.
> >
> > QEMU however /really/ needs 64-bit unsigned integers, ie a max 2^64-1.
> >
> > Libvirt has historically used the YAJL library which uses 'long long int'
> > and thus can't officially go beyond 2^63-1 values. Fortunately it lets
> > libvirt get at the raw json string, so libvirt can re-parse the value
> > to get an 'unsigned long long'.
> >
> > We recently tried to switch to Jansson because YAJL has had a dead
> > upstream for many years and countless unanswered bugs & patches.
> > Unfortunately we forgot about this need for 2^64-1 max, and Jansson
> > also uses 'long long int' and raises a fatal parse error for unsigned
> > 64-bit values above 2^63-1. It also provides no backdoor for libvirt
> > to do its own integer parsing. Thus we had to abort our switch to
> > Jansson as it broke parsing QEMU's JSON:
> >
> >   https://bugzilla.redhat.com/show_bug.cgi?id=1614569
> >
> > Other JSON libraries we've investigated have similar problems. I imagine
> > the same may well be true of non-C based JSON impls, though I've not
> > investigated in any detail.
> >
> > Essentially libvirt is stuck with either using the dead YAJL library
> > forever, or writing its own JSON parser (most likely copying QEMU's
> > JSON code into libvirt's git).
> >
> > This feels like a very unappealing situation to be in as not being
> > able to use a JSON library of our choice is losing one of the key
> > benefits of using a standard data format.
> >
> > Thus I'd like to see a solution to this to allow QMP to be reliably
> > consumed by any JSON library that exists.
> >
> > I can think of some options:
> >
> >   1. Encode unsigned 64-bit integers as signed 64-bit integers.
> >
> >      This follows the example that most C libraries map JSON ints
> >      to 'long long int'. This is still relying on undefined
> >      behaviour as apps don't need to support > 2^53-1.
> >
> >      Apps would need to cast back to 'unsigned long long' for
> >      those QMP fields they know are supposed to be unsigned.
> >
> >
> >   2. Encode all 64-bit integers as a pair of 32-bit integers.
> >
> >      This is fully compliant with the JSON spec as each half
> >      is fully within the declared limits. App has to split or
> >      assemble the 2 pieces from/to a signed/unsigned 64-bit
> >      int as needed.
> >
> >
> >   3. Encode all 64-bit integers as strings
> >
> >      The application has to do all parsing/formatting client
> >      side.
> >
> >
> > None of these changes are backwards compatible, so I doubt we could make
> > the change transparently in QMP. Instead we would have to have a
> > QMP greeting message capability where the client can request enablement
> > of the enhanced integer handling.
> >
> > Any of the three options above would likely work for libvirt, but I
> > would have a slight preference for either 2 or 3, so that we become
> > 100% standards compliant.
>
> My preference would be 3 with the strings defined as being
> %x lower case hex formatted with a 0x prefix and no longer than 18 characters
> ("0x" + 16 nybbles). Zero padding allowed but not required.
> It's readable and unambiguous when dealing with addresses; I don't want
> to have to start decoding (2) by hand when debugging.

Yep, that's a good point about readability.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
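
To make the numbers in this thread concrete, here is a minimal Python sketch, not QEMU or libvirt code. It shows where a double stops round-tripping integers, and what options 2 and 3 could look like on the client side, using the hex convention suggested above ("0x" plus at most 16 lower case nybbles). All function names and the "addr" field are hypothetical, and float() only stands in for a JSON parser that stores numbers as doubles (Python's own json module keeps integers exact).

```python
import json

U64_MAX = 2**64 - 1
HEX_DIGITS = "0123456789abcdef"


def loses_precision(n: int) -> bool:
    """True if n does not survive a round trip through a 64-bit float."""
    return int(float(n)) != n


def encode_u64_hex(n: int) -> str:
    """Option 3 (assumed convention): "0x" + lower case hex, no padding."""
    if not 0 <= n <= U64_MAX:
        raise ValueError("not an unsigned 64-bit value")
    return f"0x{n:x}"


def decode_u64_hex(s: str) -> int:
    """Accept "0x" + 1..16 lower case nybbles; zero padding is allowed."""
    if not s.startswith("0x") or not 3 <= len(s) <= 18:
        raise ValueError("expected 0x prefix and at most 18 characters")
    digits = s[2:]
    if any(c not in HEX_DIGITS for c in digits):
        raise ValueError("expected lower case hex digits")
    return int(digits, 16)


def split_u64(n: int) -> tuple[int, int]:
    """Option 2: split into (high, low) 32-bit halves."""
    if not 0 <= n <= U64_MAX:
        raise ValueError("not an unsigned 64-bit value")
    return n >> 32, n & 0xFFFFFFFF


def join_u64(high: int, low: int) -> int:
    """Option 2: reassemble the two 32-bit halves."""
    return (high << 32) | low


if __name__ == "__main__":
    print(loses_precision(2**53 - 1))   # safe integer limit
    print(loses_precision(2**53 + 1))   # first integer a double cannot hold
    print(loses_precision(U64_MAX))     # the value QEMU needs to send
    wire = json.dumps({"addr": encode_u64_hex(U64_MAX)})
    print(wire, decode_u64_hex(json.loads(wire)["addr"]) == U64_MAX)
```

With either encoding, every value on the wire is a string or an integer below 2^32, so it stays inside what any JSON library is expected to handle; the cost is that the client must know which fields carry encoded 64-bit values.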