Re: [PATCH 1/4] virsh-domain: use correct base for virStrToLong_ui

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/23/2015 03:59 AM, Pavel Hrdina wrote:
On Fri, Oct 23, 2015 at 09:33:49AM +0200, Peter Krempa wrote:
On Thu, Oct 22, 2015 at 11:24:02 -0400, Laine Stump wrote:
On 10/21/2015 08:22 AM, Pavel Hrdina wrote:
While parsing device addresses we should use correct base and don't
count on auto-detect.  For example, PCI address uses hex numbers, but
each number starting with 0 will be auto-detected as octal number and
that's wrong.  Another wrong use-case is for PCI address if for example
bus is 10, than it's incorrectly parsed as decimal number.

PCI and CCW addresses have all values as hex numbers, IDE and SCSI
addresses are in decimal numbers.
I've seen examples for PCI that use decimal a number for the slot (which
is the one item that's likely to have a value > 7 unless you have a
system with a whole bunch of PCI controllers)[*], and those that use hex
numbers always prefix the number with "0x". libvirt itself always a
lspci doesn't really use 0x prefix:

My statement above was for any libvirt documentation, not "any program anywhere displaying a PCI address". The standard notation for a "unified PCI address" (coining my own term :-) is as they are displayed in lspci (and in all the entries in sysfs).



00:16.0 Communication controller: Intel Corporation Wildcat Point-LP MEI Controller #1 (rev 03)
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (3) I218-LM (rev 03)
00:1b.0 Audio device: Intel Corporation Wildcat Point-LP High Definition Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation Wildcat Point-LP PCI Express Root Port #6 (rev e3)

Btw that's a laptop, not a super special system.

prefixes the domain, bus, and slot with 0x, so an auto-detected base
will always get those right.
Well, not entirely:

nodedev identificators use hex, possibly padded with zeroes:

pci_0000_00_03_0
pci_0000_00_14_0
pci_0000_00_16_0
pci_0000_00_19_0
pci_0000_00_1b_0

Which may result also in some octal 'fun' on boxes with 16 buses ;).

Sigh. Yes, I was only considering the <address> element, not the stuff in node device XML and virsh nodedev-* output :-(. What you have pointed out makes me even more wary of changing libvirt's parsing behavior.




nodedev XML uses decimal in the parsed address:

$ virsh nodedev-dumpxml pci_0000_00_19_0
<device>
   <name>pci_0000_00_19_0</name>
   <path>/sys/devices/pci0000:00/0000:00:19.0</path>
   <parent>computer</parent>
   <driver>
     <name>e1000e</name>
   </driver>
   <capability type='pci'>
     <domain>0</domain>
     <bus>0</bus>
     <slot>25</slot>
     <function>0</function>
     <product id='0x15a2'>Ethernet Connection (3) I218-LM</product>
     <vendor id='0x8086'>Intel Corporation</vendor>
   </capability>
</device>

(Uhhh, so 19 ... or 25? ... or perhaps 31?)


And for hostdevs we are promoting the use of 0x prefixed hex:

     <address domain='0x0000' bus='0x06' slot='0x02' function='0x0'/>

As well as with target addresses. Fortunately.

<sarcasm>Well that's very consistent.</sarcasm>


Yes, the inconsistency is actually part of my point. If we silently change the parsing to now recognize "25" (no "0x" prefix) as a hex number, it's going to give the wrong results for any XML written using a decimal slot# that worked properly on "old" libvirt, and it's going to make everything confusing in a different way since everyone will need to keep straight which version of libvirt interprets it which way - you'll *never* be able to write examples for documentation that omit the "0x" anyway, since you can't be sure the reader of the documentation has a new enough version of libvirt to force parsing as hex.

For example, if someone has a script that uses XML they created "a long time ago" by copy/pasting the slot# from the nodedev-dumpxml output (or maybe just hand-writing it) into XML that they use for "virsh create" or "virsh attach-device" (i.e. the XML is stored/generated external to libvirt in its original form, *not* normalized by the libvirt parse/format cycle), and upgraded libvirt will magically/silently begin interpreting any slot# > 9 differently, leading to anything from a different layout of devices on the guest PCI bus, to failure to attach a device because it wasn't found (or even worse, silently/erroneously attaching the *wrong* host device to the guest). I don't think that is acceptable.


If we want to get rid of the inconsistency, without creating the possibility of *silently* performing incorrect operations, we should:

* require the input in the <address> element to have a "0x" prefix to be sure that there is no assumption on the part of the user that they are entering a decimal number.

* change the node device XML (nodedev-dumpxml) to output the domain, bus, and slot elements as hex, including the 0x (function is irrelevant, since by definition it can never be > 7)

Even doing that would carry the danger of causing existing functioning systems to fail though (although they would at least fail in a vocal manner).

(BTW, I looked for any places in libvirt that currently parse a number *in config* as hex without the 0x prefix - the only one I can find is this:

  * <address type='spapr-vio' reg='0x3000'/> <- reg is forced base16

(and a deprecated "devaddr" in the status (so probably unused in many years) that parses a "unified" address). Anything else that is read from config is either forced to base 10, or is base 0. So there isn't much precedence for interpreting numbers in libvirt config with no "0x" prefix as hex.)


So I I think the existing code is correct, and don't see an upside to
making this restriction/invisible change in semantics, and it could
potentially lead to incorrect results if someone is thinking that
they're entering decimal numbers.
The existing code is correct only perhaps from a historical point of
view. From a functional it's more than flawed. ...

[*] There was one particular document that even went to the trouble of
explaining how to convert the hex value of slot into a decimal number to
use in the libvirt config!
Well, this hints that we do actually something wrong here. The ambiguity
in parsing of numbers with virStrToLong_ui has already bitten us (I
won't bother looking up the commit though).

It's not ambiguous to the parser. Any given string is parsed to exactly one result. The problem is that the users expect the parser to behave differently than it does. Well, *some* of the users do. Maybe even *most* of them. But not *all* of them :-).

As for previous problems with differing bases, this reminded me of a very troublesome problem that I spent the time to look up just to fill the blank in my memory - commit 8efebd176. This is a case where libvirt was formatting the bus/device numbers of a USB device on the qemu commandline with %.3d (forcing leading 0's) and qemu was then interpreting the numbers as octal, so silently/magically attaching the wrong USB device. This problem wasn't solved by making qemu always parse USB bus/device as decimal though; it was instead solved by making libvirt aware of / follow the rules of qemu's parser. (I realize it's a bit more difficult to "patch the user", but that is also an option :-)



The problem is that if you use 0 as a base argument it's not really
clear what you've parsed and that will result in situations like this.
The developer may be lucky in trying numbers that were parsed correctly
misleading him that he used the correct base.

I think that we shouldn't be using 0 as base in ANY place in our code.

I think that it is dangerous to have a mix of base 16 and base 10 numbers in a file while not requiring a clear indicator on every number of what base was used to encode them. Sure there are some numbers that are obviously hex (an address in memory) and there are some that are obviously decimal (timeout in seconds), but there are some where it's just not clear to the user. The case of PCI addresses is one of those (mostly due to the node device XML. But simply changing that won't eliminate all the existing examples out on the internet (nor will it immediately upgrade all the existing installations of libvirt).


(BTW, I also think there is absolutely *no* place for octal representations of numbers anywhere in anything libvirt does. This is not 1975, and we're not all (waiting in line for our turn at) sitting in front of a TTY connected to a PDP-11 :-P)



As of this particular case: If we format it in base 16, we should parse
it in base 16. Having ambiguity in the parser code will only end up in
problems.

Okay, then to take that to its logical conclusion - if we format it with a leading 0x, then we should require the leading 0x when parsing.



Signed-off-by: Pavel Hrdina <phrdina@xxxxxxxxxx>
---
   tools/virsh-domain.c | 30 +++++++++++++++---------------
   1 file changed, 15 insertions(+), 15 deletions(-)

Peter
A agree with Peter, as there is no reasonable point to represent PCI address in
decimal numbers, only libvirt does that and as said, it's wrong.  I don't know
any other place, where the PCI address is printed as decimal number.

I've been motivated to update this behavior after I've used '0000:05:10.1' as
an argument for the '--srouce' and it parsed the slot as decimal number and
printed in to the XML as '0x0a'.

Yeah I can understand that happening (and your annoyance when it did), and if all pci address info everywhere in libvirt and its documentation was (and always had been) output in hex (*without* the leading "0x") I would probably agree with this patch. But due to the history (and current state) I think it's just going to create as many problems as it solves. I *might* agree with it if it was done by requiring a leading '0x' (having the effect of causing failure for any existing XML that had been using decimal - I'm not 100% sure that would be acceptable either, but at least it is a *visible* failure, rather than a silent one), and I would be more prone to agree if the node device XML was modified to output the bus/slot/port in hex (with leading 0x - again, I can't say for sure that even that wouldn't screw up some existing management application though, so I'm *still* hesitant).



--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list



[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]