Re: [PATCH libvirt v1 0/6] Fix ZPCI address auto-generation on s390

Boris Fiuczynski <fiuczy@xxxxxxxxxxxxx> · Thu, 14 May 2020 15:34:30 +0200

On 5/14/20 10:37 AM, Daniel P. Berrangé wrote:
On Wed, May 13, 2020 at 07:41:34PM +0200, Andrea Bolognani wrote:
On Wed, 2020-05-13 at 17:32 +0100, Daniel P. Berrangé wrote:
On Tue, May 12, 2020 at 12:13:22PM +0200, Boris Fiuczynski wrote:
The behavior change would be
Current code:
  uid=0 fid=0 -> uid=0 fid=0 -> address gets autogenerated
  uid=0 fid=x -> uid=0 fid=x -> address is rejected as invalid
  uid=0       -> uid=0 fid=0 -> address gets autogenerated

IIUC, in the two cases here where the address gets auto-generated,
the resulting guest VM successfully boots & runs....

With the series applied
  uid=0 fid=0 -> uid=0 fid=0 -> address is rejected as invalid
  uid=0 fid=x -> uid=0 fid=x -> address is rejected as invalid
  uid=0       -> uid=0 fid=0 -> address is rejected as invalid

...so this proposed change is a functional regression for the
user.

The documentation already specifies the uid value range correctly.
The fix for users hitting the two scenarios (uid=0 fid=0) and (uid=0) is
simple: Remove the zpci definition completely.

This would be taking a users' currently working VM, intentionally
breaking it, and then making the user pick up the pieces. This is
an example of a behaviour regression that libvirt promises to not
do to users.

The bit of nuance that might be missing here is that existing guests
already have a full zPCI address stored in the domain XML, which
means the wouldn't be affected in any way; additionally, the case
where no zPCI address is provided when defining a new guest, which I
assume is the most common one, will keep working.

The only scenarios that would no longer work are:

   * the user manually specifies uid=0 fid=0;
   * the user manually specifies uid=0 and doesn't specify fid.

In both cases the user would have gone out of their way to specify
a value for the uid attribute that is documented as being invalid:

   PCI addresses for S390 guests will have a zpci child element, with
   two attributes: uid (a hex value between 0x0001 and 0xffff [...]

   https://libvirt.org/formatdomain.html#elementsAddress

The effect of specifying zero though is that we perform allocation
to assign a non-zero address, which is then valid. The same happens
with regular PCI devices if you give slot="0".

As a result, they'd now get a pretty clear error message at define
time instead of confusing behavior across the board. I'm not really
sure anyone would complain about such a change.

I don't see this existing behaviour as confusing. It looks like mostly
being a docs ommission about auto-allocation taking place.

Maybe I am repeating myself but I find e.g the below example confusing 
if I take into consideration that uid=0 is NOT a valid value and fid is 
a valid value. Please note that the valid fid is dislocated from its 
original device!

Specify this in the domain:
   pcidev1: uid='0x0000' fid='0x00000000'
   pcidev2: uid='0x0000'
Results in a defined domain:
   pcidev1: uid='0x0002' fid='0x00000001'
   pcidev2: uid='0x0001' fid='0x00000000'

If the user would be tying to fix the dislocating fid... one would very 
likely try this:
Specify this in the domain:
   pcidev1: uid='0x0000' fid='0x00000000'
   pcidev2: uid='0x0000' fid='0x00000001'
Result:
error: Failed to define domain from mini-pcis.xml
error: XML error: Invalid PCI address uid='0x0000', must be > 0x0000 and 
<= 0xffff

Btw setting uid=0 is defined by architecture for a mode that we do not 
support in qemu AND setting fid=0 is an architectural valid assignment 
which in the example above is not granted to the device it was defined for.

Regards,
Daniel

--
Mit freundlichen Grüßen/Kind regards
   Boris Fiuczynski

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294