Re: [PATCH v9 22/22] s390: doc: detailed specifications for AP virtualization

Harald Freudenberger <freude@xxxxxxxxxxxxx> · Tue, 21 Aug 2018 11:00:00 +0200



On 20.08.2018 18:03, Cornelia Huck wrote:
> On Mon, 13 Aug 2018 17:48:19 -0400
> Tony Krowiak <akrowiak@xxxxxxxxxxxxxxxxxx> wrote:
>
>> From: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
>>
>> This patch provides documentation describing the AP architecture and
>> design concepts behind the virtualization of AP devices. It also
>> includes an example of how to configure AP devices for exclusive
>> use of KVM guests.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
>> Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx>
>> Signed-off-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
>> ---
>>  Documentation/s390/vfio-ap.txt |  615 ++++++++++++++++++++++++++++++++++++++++
>>  MAINTAINERS                    |    1 +
>>  2 files changed, 616 insertions(+), 0 deletions(-)
>>  create mode 100644 Documentation/s390/vfio-ap.txt
>>
>> +AP Architectural Overview:
>> +=========================
>> +To facilitate the comprehension of the design, let's start with some
>> +definitions:
>> +
>> +* AP adapter
>> +
>> +  An AP adapter is an IBM Z adapter card that can perform cryptographic
>> +  functions. There can be from 0 to 256 adapters assigned to an LPAR. Adapters
>> +  assigned to the LPAR in which a linux host is running will be available to
>> +  the linux host. Each adapter is identified by a number from 0 to 255. When
>> +  installed, an AP adapter is accessed by AP instructions executed by any CPU.
>> +
>> +  The AP adapter cards are assigned to a given LPAR via the system's Activation
>> +  Profile which can be edited via the HMC. When the system is IPL'd, the AP bus
> There's lots of s390 jargon in here... but one hopes that someone
> trying to understand AP is already familiar with the basics...
>
>> +  module is loaded and detects the AP adapter cards assigned to the LPAR. The AP
>> +  bus creates a sysfs device for each adapter as they are detected. For example,
>> +  if AP adapters 4 and 10 (0x0a) are assigned to the LPAR, the AP bus will
>> +  create the following sysfs entries:
>> +
>> +    /sys/devices/ap/card04
>> +    /sys/devices/ap/card0a
>> +
>> +  Symbolic links to these devices will also be created in the AP bus devices
>> +  sub-directory:
>> +
>> +    /sys/bus/ap/devices/[card04]
>> +    /sys/bus/ap/devices/[card0a]
>> +
>> +* AP domain
>> +
>> +  An adapter is partitioned into domains. Each domain can be thought of as
>> +  a set of hardware registers for processing AP instructions. An adapter can
>> +  hold up to 256 domains. Each domain is identified by a number from 0 to 255.
>> +  Domains can be further classified into two types:
>> +
>> +    * Usage domains are domains that can be accessed directly to process AP
>> +      commands.
>> +
>> +    * Control domains are domains that are accessed indirectly by AP
>> +      commands sent to a usage domain to control or change the domain; for
>> +      example, to set a secure private key for the domain.
>> +
>> +  The AP usage and control domains are assigned to a given LPAR via the system's
>> +  Activation Profile which can be edited via the HMC. When the system is IPL'd,
>> +  the AP bus module is loaded and detects the AP usage and control domains
>> +  assigned to the LPAR. The domain number of each usage domain will be coupled
>> +  with the adapter number of each AP adapter assigned to the LPAR to identify
>> +  the AP queues (see AP Queue section below). The domain number of each control
>> +  domain will be represented in a bitmask and stored in a sysfs file
>> +  /sys/bus/ap/ap_control_domain_mask created by the bus. The bits in the mask,
>> +  from most to least significant bit, correspond to domains 0-255.
>> +
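The bit numbering described above (most significant bit first, for domains 0-255) can be illustrated with a short Python sketch. It is only an illustration: it assumes the mask is read as a hexadecimal string, optionally prefixed with "0x", and the helper name domains_from_mask is made up for this example.

```python
def domains_from_mask(mask_hex, nbits=256):
    """Return the domain numbers whose bits are set in a hex mask string.

    Assumption for this sketch: bit 0 is the most significant bit of an
    nbits-wide mask, as the quoted text describes for domains 0-255.
    """
    text = mask_hex.strip().lower()
    if text.startswith("0x"):
        text = text[2:]
    value = int(text, 16)
    # Domain n sits n positions below the most significant bit.
    return [n for n in range(nbits) if (value >> (nbits - 1 - n)) & 1]


# Build a mask with domains 6 and 71 (0x47) set, then decode it again.
example = (1 << (255 - 6)) | (1 << (255 - 71))
print(domains_from_mask("0x" + format(example, "064x")))
```

A string shorter than 64 hex digits is treated as right-aligned here, which may not match what a real system reports.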
>> +  A domain may be assigned to a system as both a usage and control domain, or
>> +  as a control domain only. Consequently, all domains assigned as both a usage
>> +  and control domain can both process AP commands as well as be changed by an AP
>> +  command sent to any usage domain assigned to the same system. Domains assigned
>> +  only as control domains can not process AP commands but can be changed by AP
>> +  commands sent to any usage domain assigned to the system.
> I'm struggling a bit with this paragraph. Does that mean that you can
> use control domains as the target of an instruction changing
> configuration on the system? (Or on the VM, if they are listed in the
> relevant control block?)
Yes. You can send a CPRB to a (usage) domain which includes
a command for controlling another (control) domain.
>
>> +
>> +* AP Queue
>> +
>> +  An AP queue is the means by which an AP command-request message is sent to a
>> +  usage domain inside a specific adapter. An AP queue is identified by a tuple
>> +  comprised of an AP adapter ID (APID) and an AP queue index (APQI). The
>> +  APQI corresponds to a given usage domain number within the adapter. This tuple
>> +  forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
>> +  instructions include a field containing the APQN to identify the AP queue to
>> +  which the AP command-request message is to be sent for processing.
>> +
>> +  The AP bus will create a sysfs device for each APQN that can be derived from
>> +  the cross product of the AP adapter and usage domain numbers detected when the
>> +  AP bus module is loaded. For example, if adapters 4 and 10 (0x0a) and usage
>> +  domains 6 and 71 (0x47) are assigned to the LPAR, the AP bus will create the
>> +  following sysfs entries:
>> +
>> +    /sys/devices/ap/card04/04.0006
>> +    /sys/devices/ap/card04/04.0047
>> +    /sys/devices/ap/card0a/0a.0006
>> +    /sys/devices/ap/card0a/0a.0047
>> +
>> +  The following symbolic links to these devices will be created in the AP bus
>> +  devices subdirectory:
>> +
>> +    /sys/bus/ap/devices/[04.0006]
>> +    /sys/bus/ap/devices/[04.0047]
>> +    /sys/bus/ap/devices/[0a.0006]
>> +    /sys/bus/ap/devices/[0a.0047]
>> +
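The naming scheme in the listing above (a two-digit hex APID, a dot, and a four-digit hex APQI) can be reproduced with a minimal Python sketch. The function name ap_queue_paths is hypothetical, and the format is inferred only from the example paths in the quoted text.

```python
from itertools import product


def ap_queue_paths(adapters, usage_domains):
    """List the sysfs queue paths for the cross product of adapters and domains.

    The path layout is inferred from the examples in the quoted text:
    /sys/devices/ap/card<APID>/<APID>.<APQI>, all numbers in lowercase hex.
    """
    paths = []
    for apid, apqi in product(sorted(adapters), sorted(usage_domains)):
        card = f"card{apid:02x}"
        queue = f"{apid:02x}.{apqi:04x}"
        paths.append(f"/sys/devices/ap/{card}/{queue}")
    return paths


for path in ap_queue_paths([4, 10], [6, 71]):
    print(path)
```

With adapters 4 and 10 and usage domains 6 and 71 this yields the four entries listed in the patch text.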
>> +* AP Instructions:
>> +
>> +  There are three AP instructions:
>> +
>> +  * NQAP: to enqueue an AP command-request message to a queue
>> +  * DQAP: to dequeue an AP command-reply message from a queue
>> +  * PQAP: to administer the queues
> So, NQAP/DQAP need usage domains, while PQAP needs a control domain? Or
> is it that all of them need usage domains, but PQAP can target a control
> domain as well?
>
> [I don't want to dive deeply into the AP architecture here, just far
> enough to really understand the design implications.]
Well, to be honest, nobody ever tried this under Linux. Theoretically
one should be able to send a CPRB to a usage domain where inside
the CPRB another domain (the control domain) is addressed. However,
as of now I am only aware of applications controlling the same usage
domain. I don't know any application which is able to address another
control domain and I am not sure if the zcrypt device driver would
handle such a CPRB correctly. NQAP, DQAP and PQAP always address
a usage domain. But the CPRB sent down the pipe via NQAP may
address some control thing on another domain. I am not sure which
code does the sorting out here, or where. There are two candidates:
the firmware layer in the CEC and the crypto card code.
>
>> +
>> +AP and SIE:
>> +==========
>> +Let's now take a look at how AP instructions executed on a guest are interpreted
>> +by the hardware.
>> +
>> +A satellite control block called the Crypto Control Block (CRYCB) is attached to
>> +our main hardware virtualization control block. The CRYCB contains three fields
>> +to identify the adapters, usage domains and control domains assigned to the KVM
>> +guest:
>> +
>> +* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
>> +  to the KVM guest. Each bit in the mask, from most significant to least
>> +  significant bit, corresponds to an APID from 0-255. If a bit is set, the
>> +  corresponding adapter is valid for use by the KVM guest.
>> +
>> +* The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
>> +  assigned to the KVM guest. Each bit in the mask, from most significant to
>> +  least significant bit, corresponds to an AP queue index (APQI) from 0-255. If
>> +  a bit is set, the corresponding queue is valid for use by the KVM guest.
>> +
>> +* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
>> +  domains assigned to the KVM guest. The ADM bit mask controls which domains
>> +  can be changed by an AP command-request message sent to a usage domain from
>> +  the guest. Each bit in the mask, from most significant to least significant
>> +  bit, corresponds to a domain from 0-255. If a bit is set, the corresponding domain
>> +  can be modified by an AP command-request message sent to a usage domain
>> +  configured for the KVM guest.
> OK, that seems to imply that you modify a control domain by sending a
> request to (any) usage domain? I do not doubt that, but the whole
> architecture is really confusing :)
>
>> +
>> +If you recall from the description of an AP Queue, AP instructions include
>> +an APQN to identify the AP adapter and AP queue to which an AP command-request
>> +message is to be sent (NQAP and PQAP instructions), or from which a
>> +command-reply message is to be received (DQAP instruction). The validity of an
>> +APQN is defined by the matrix calculated from the APM and AQM; it is the
>> +cross product of all assigned adapter numbers (APM) with all assigned queue
>> +indexes (AQM). For example, if adapters 1 and 2 and usage domains 5 and 6 are
>> +assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be valid for
>> +the guest.
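The validity rule described above (an APQN is valid when its APID bit is set in the APM and its APQI bit is set in the AQM) can be sketched in Python as follows. This models the masks as plain 256-bit integers with bit 0 as the most significant bit; it is an illustration of the rule as worded in the patch, not of how the hardware or KVM stores the CRYCB, and all names are hypothetical.

```python
NBITS = 256  # APIDs and APQIs both range from 0 to 255


def make_mask(numbers):
    """Build a 256-bit mask in which bit n (MSB first) is set for each n."""
    mask = 0
    for n in numbers:
        if not 0 <= n < NBITS:
            raise ValueError(f"{n} is outside 0-{NBITS - 1}")
        mask |= 1 << (NBITS - 1 - n)
    return mask


def bit_is_set(mask, n):
    """Return True if bit n (MSB first) is set; out-of-range n is never set."""
    if not 0 <= n < NBITS:
        return False
    return bool((mask >> (NBITS - 1 - n)) & 1)


def apqn_is_valid(apm, aqm, apid, apqi):
    """An APQN is valid only if both its adapter and its queue index are assigned."""
    return bit_is_set(apm, apid) and bit_is_set(aqm, apqi)


apm = make_mask([1, 2])  # adapters 1 and 2
aqm = make_mask([5, 6])  # usage domains 5 and 6
print(apqn_is_valid(apm, aqm, 1, 5), apqn_is_valid(apm, aqm, 1, 7))
```

The sketch deliberately leaves out the ADM, since the quoted text does not say how it interacts with APQN validity.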
> How does the control domain mask interact with that? Can you send a
> command to an APQN valid for the guest to modify any control domain
> specified in the mask? Does the SIE complain if you specify a control
> domain that the host does not have access to (I'd guess so)?
>
>> +
>> +The APQNs can provide secure key functionality - i.e., a private key is stored
>> +on the adapter card for each of its domains - so each APQN must be assigned to
>> +at most one guest or to the linux host.
>> +
>> +   Example 1: Valid configuration:
>> +   ------------------------------
>> +   Guest1: adapters 1,2  domains 5,6
>> +   Guest2: adapters 1,2  domain  7
>> +
>> +   This is valid because both guests have a unique set of APQNs: Guest1 has
>> +   APQNs (1,5), (1,6), (2,5) and (2,6); Guest2 has APQNs (1,7) and (2,7).
>> +
>> +   Example 2: Invalid configuration:
>> +   --------------------------------
>> +   Guest1: adapters 1,2  domains 5,6
>> +   Guest2: adapter  1    domains 6,7
>> +
>> +   This is an invalid configuration because both guests have access to
>> +   APQN (1,6).
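The rule behind the two examples above (no APQN may be shared between guests) can be checked mechanically. The following Python sketch compares the cross products of each pair of guests; the function names and the dictionary layout are assumptions made for this illustration, and it covers only the APQN rule as stated, not control domains.

```python
from itertools import combinations, product


def apqns(adapters, domains):
    """Return the set of (APID, APQI) pairs formed by the cross product."""
    return set(product(adapters, domains))


def find_conflicts(guests):
    """Map each pair of guest names to the APQNs they would share.

    guests: dict of name -> (adapters, domains). An empty result means
    every APQN belongs to at most one guest.
    """
    conflicts = {}
    for (name_a, cfg_a), (name_b, cfg_b) in combinations(guests.items(), 2):
        shared = apqns(*cfg_a) & apqns(*cfg_b)
        if shared:
            conflicts[(name_a, name_b)] = sorted(shared)
    return conflicts


example1 = {"Guest1": ([1, 2], [5, 6]), "Guest2": ([1, 2], [7])}
example2 = {"Guest1": ([1, 2], [5, 6]), "Guest2": ([1], [6, 7])}
print(find_conflicts(example1))
print(find_conflicts(example2))
```

Example 1 produces no conflicts, while Example 2 reports APQN (1,6) for the pair Guest1/Guest2. Two guests that share domains but use disjoint adapters also produce no conflicts under this rule alone.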
> So, the adapters or the domains can overlap, but the cross product
> mustn't? If I had
>
> Guest1: adapters 1,2 domains 5,6
> Guest2: adapters 3,4 domains 5,6
>
> would that be fine?
>
> Is there any rule about shared control domains?
>
> (...)
>
>> +Limitations
>> +===========
>> +* The KVM/kernel interfaces do not provide a way to prevent unbinding an AP
>> +  queue that is still assigned to a mediated device. Even if the device
>> +  'remove' callback returns an error, the device core detaches the AP
>> +  queue from the VFIO AP driver. It is therefore incumbent upon the
>> +  administrator to make sure there is no mediated device to which the
>> +  APQN - for the AP queue being unbound - is assigned.
>> +
>> +* Hot plug/unplug of AP devices is not supported for guests.
> Not sure what that sentence means. Adding/removing devices by the
> hypervisor is not supported? Or some guest actions, respectively
> injecting notifications that would trigger some actions on the real
> hardware?
>
> Do you want to add (some of) this in the future?
>
>> +
>> +* Live guest migration is not supported for guests using AP devices.
> Migration and vfio is an interesting area in general :) Would be great
> if vfio-ap could benefit from any generic efforts in that area, but
> that probably requires that someone with access to documentation and
> hardware keeps an eye on developments.
>
>> \ No newline at end of file
> Please add one :)
>