On Thu, 26 Oct 2017 11:54:54 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxxxxxxx> wrote:

Cool, documentation!

> Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxxxxxxx>
> ---
>  docs/ap_matrix.txt |  529 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 529 insertions(+), 0 deletions(-)
>  create mode 100644 docs/ap_matrix.txt
>
> diff --git a/docs/ap_matrix.txt b/docs/ap_matrix.txt
> new file mode 100644
> index 0000000..ec7bd44
> --- /dev/null
> +++ b/docs/ap_matrix.txt
> @@ -0,0 +1,529 @@
> +Adjunct Processor (AP) Matrix Devices
> +=====================================
> +
> +Contents:
> +=========
> +* Introduction
> +* AP Architectural Overview
> +* Start Interpretive Execution (SIE) Instruction
> +* AP Matrix Configuration on Linux Host
> +* AP Matrix Configuration for a Linux Guest
> +* Starting a Linux Guest Configured with an AP Matrix
> +* Example: Configure AP Matrices for Two Linux Guests
> +
> +Introduction:
> +============
> +The IBM Adjunct Processor (AP) Cryptographic Facility is comprised
> +of three AP instructions and from 1 to 256 PCIe cryptographic adapter cards.
> +These AP devices provide cryptographic functions to all CPUs assigned to a
> +linux system running in an IBM Z system LPAR.

Before you start with the details: Give a very, very high level overview?
Like: On s390x, crypto cards are exposed via the AP bus. This document
describes how those cards can be made available to KVM guests via vfio.

> +
> +The intent of this document is to provide administrators with the basic
> +knowledge needed to provide a linux guest with direct access to one or more AP
> +adapters available to the host linux system using an AP matrix device.
> +
> +AP Architectural Overview:
> +=========================
> +In order to understand the terminology used in the rest of this document, let's
> +start with some definitions:
> +
> +* AP adapter
> +
> +  An AP adapter is a PCIe cryptographic adapter that can perform cryptographic
> +  functions.
> +  There can be from 0 to 256 AP adapters assigned to an LPAR.
> +  Each adapter is identified by a number from 0 to 255. When
> +  installed, an AP is accessed by AP instructions executed by any CPU.
> +
> +* AP domain
> +
> +  An adapter is partitioned into domains. Each domain can be thought of as
> +  a set of hardware registers dedicated to an active LPAR. An adapter can hold
> +  up to 256 domains. Each domain is identified by a number from 0 to 255.
> +  Domains can be further classified into two types:
> +
> +    * Usage domains are domains that can be accessed directly to process AP
> +      commands.
> +
> +    * Control domains are domains that are accessed indirectly by AP
> +      commands sent to a usage domain to control or change the domain, for
> +      example, to specify a private key that can be used by the domain to
> +      perform cryptographic functions.
> +
> +* AP Queue
> +
> +  An AP queue is the means by which an AP command is sent to an
> +  AP usage domain inside a specific AP. An AP queue is identified by a tuple
> +  comprised of an AP adapter ID and a usage domain index. The index corresponds
> +  to a given usage domain within the adapter. This tuple forms an AP Queue
> +  Number (APQN). AP instructions specify an APQN to identify the AP Queue
> +  to which an AP command-request message is to be sent, or from which a
> +  command-reply message is to be received. An APQN is specified in this
> +  document with one of two formats: APQN (xx,yyyy) or simply xx.yyyy, where
> +  xx is an adapter number and yyyy is a domain number. Both numbers will be
> +  specified in hexadecimal format.
> +
> +* AP Instructions:
> +
> +  There are three AP instructions:
> +
> +  * NQAP: to enqueue an AP command-request message to an AP queue
> +  * DQAP: to dequeue an AP command-reply message from an AP queue
> +  * PQAP: to administer an AP queue
> +
> +Start Interpretive Execution (SIE) Instruction
> +==============================================
> +A linux guest running on an IBM Z system is started under KVM by executing the
> +Start Interpretive Execution (SIE) instruction. The SIE state description is a
> +control block that contains the state information for a KVM guest and is
> +supplied as input to the SIE instruction. The SIE state description contains a
> +field that references a Crypto Control Block (CRYCB) containing three
> +fields to identify the AP adapters, usage domains and control domains assigned
> +to the KVM guest:
> +
> +* The AP Mask (APM) field specifies the AP adapter numbers assigned to the
> +  KVM guest. The APM controls which adapters are valid for the KVM guest.
> +
> +* The AP Queue Mask (AQM) field specifies the AP usage domain numbers assigned
> +  to the KVM guest. The AQM controls which usage domains are valid for the
> +  KVM guest.
> +
> +* The AP Domain Mask (ADM) field specifies the AP control domains assigned to
> +  the KVM guest. The ADM controls which control domains are valid for the
> +  KVM guest.
> +
> +These three fields comprise the AP matrix for the guest. The APQNs accessible
> +to the guest are the cross product of all assigned adapter numbers (APM) and
> +all assigned usage domain numbers (AQM). For example, if adapters 1 and 2 and
> +usage domains 5 and 6 are assigned to a guest, the APQNs (1,5), (1,6), (2,5) and
> +(2,6) will be valid for AP instructions executed on the guest.
> +
> +The SIE instruction is run in interpretive execution mode, which means the
> +AP instructions executed on the guest are interpreted by the hardware. This
> +allows a guest direct access to the AP adapter cards.
> +Since each domain within a given adapter holds the master key used in the
> +cryptographic functions it supports, each APQN must be assigned to at most
> +one guest.
> +
> +  Example 1: Valid configuration for two guests:
> +  ---------------------------------------------
> +  Guest1: adapters 1,2  domains 5,6
> +  Guest2: adapters 1,2  domain 7
> +
> +  This is valid because both guests have a unique set of APQNs: Guest1 has
> +  APQNs (1,5), (1,6), (2,5) and (2,6); Guest2 has APQNs (1,7) and (2,7). There
> +  is no overlap.
> +
> +  Example 2: Invalid configuration for two guests:
> +  -----------------------------------------------
> +  Guest1: adapters 1,2  domains 5,6
> +  Guest2: adapter 1     domains 6,7
> +
> +  This is an invalid configuration because both guests have access to
> +  APQN (1,6).
> +
> +AP Matrix Configuration on Linux Host:
> +=====================================
> +A linux system is a guest of the LPAR in which it is running and has access to
> +the AP resources configured for the LPAR. The LPAR's AP matrix is
> +configured using the 'Customize/Delete Activation Profiles' dialog from the HMC.
> +This dialog displays the activation profiles configured for the linux system.
> +Selecting the specific activation profile to be edited and clicking the
> +'Customize Profile' button will open the 'Customize Image Profiles' dialog.
> +Selecting the 'Crypto' link in the tree view on the left hand side of the dialog
> +will display the AP matrix configuration in the right hand panel. There, one can
> +assign AP adapters - called Cryptos - and domains to the LPAR. When the linux
> +system is started using this activation profile, it will have access to the
> +AP matrix configured via the activation profile.
> +
> +When the linux system is started, the AP adapter devices will be connected to
> +the AP bus and the following AP matrix interfaces will be created in sysfs:
> +
> +/sys/bus/ap
> +... [devices]
> +...... xx.yyyy
> +...... ...
> +...... cardxx
> +...... ...
> +
> +Where:
> +    cardxx     is adapter number xx (in hex)
> +    yyyy       is usage domain number yyyy (in hex)
> +    xx.yyyy    is APQN (xx,yyyy)
> +
> +For example, if AP adapters 5 and 6 and domains 4 and 71 (0x47) are configured
> +for the LPAR, the sysfs representation on the linux system would look like this:
> +
> +/sys/bus/ap
> +... [devices]
> +...... 05.0004
> +...... 05.0047
> +...... 06.0004
> +...... 06.0047
> +...... card05
> +...... card06
> +
> +There will also be AP device drivers created to control each type of AP matrix
> +interface available to the IBM Z system:
> +
> +/sys/bus/ap
> +... [drivers]
> +...... [cex2acard]   for Crypto Express 2/3 accelerator cards
> +...... [cex2aqueue]  for AP queues served by Crypto Express 2/3
> +                     accelerator cards
> +...... [cex4card]    for Crypto Express 4/5/6 accelerator and coprocessor
> +                     cards
> +...... [cex4queue]   for AP queues served by Crypto Express 4/5/6
> +                     accelerator and coprocessor cards
> +...... [pcixcccard]  for Crypto Express 2/3 coprocessor cards
> +...... [pcixccqueue] for AP queues served by Crypto Express 2/3
> +                     coprocessor cards
> +
> +Links to the AP interfaces controlled by each AP device driver will be created
> +in the device driver's sysfs directory. For example, if AP adapter 5 and domains
> +4 and 71 (0x47) are assigned to the LPAR and adapter 5 is a CEX5 card, the
> +following links will be created in the CEX5 drivers' sysfs directories:
> +
> +/sys/bus/ap
> +... [drivers]
> +...... [cex4card]
> +......... [card05]
> +...... [cex4queue]
> +......... [05.0004]
> +......... [05.0047]
> +
> +AP Matrix Configuration for a Linux Guest:
> +=========================================
> +In order to configure the AP matrix for a guest, the adapters, usage domains
> +and control domains to be used by the guest must be identified. This section
> +describes how to configure a guest's AP matrix.
> +
> +When the linux host is booted, an AP matrix bus will be initialized.
> +When initialized, the AP matrix bus creates a single AP matrix device to
> +hold the APQNs to be made available to guests:
> +
> +/sys/bus/ap_matrix
> +... [devices]
> +......[matrix]        symlink to the AP matrix device directory
> +
> +/sys/devices
> +... [ap_matrix]
> +......[matrix]        the AP matrix device directory
> +
> +The kernel interfaces for configuring an AP matrix for a linux guest are built
> +on the VFIO mediated device framework and are provided by the vfio_ap_matrix
> +kernel module. The dependency chain for the vfio_ap_matrix module is:
> +
> +* vfio
> +* mdev
> +* vfio_mdev
> +* vfio_ap_matrix
> +
> +When the vfio_ap_matrix module is loaded, it will create the following sysfs
> +interfaces:
> +
> +/sys/bus/ap
> +... [drivers]
> +...... [vfio_ap_matrix]
> +......... bind
> +
> +The vfio_ap_matrix device driver is created to provide an interface for securing
> +APQNs from use by the host linux system. This is accomplished by unbinding the
> +APQNs from the host device driver and binding them to the vfio_ap_matrix
> +device driver. For example, suppose we want to secure APQN (05,0004). Assuming
> +for this example that AP adapter card 5 is a CEX5 coprocessor card:
> +
> +    echo 05.0004 > /sys/bus/ap/drivers/cex4queue/unbind
> +    echo 05.0004 > /sys/bus/ap/drivers/vfio_ap_matrix/bind
> +
> +This action will store the APQN in the /sys/devices/ap_matrix/matrix device
> +which makes it available for use by a linux guest.
> +
> +Another side effect of loading the vfio_ap_matrix module is the creation of the
> +sysfs interfaces for configuring an AP matrix for a linux guest. These sysfs
> +interfaces are built on the VFIO mediated device framework. To configure an AP
> +matrix for a guest, a mediated matrix device must be created for the
> +/sys/devices/ap_matrix/matrix device. A mediated matrix device must be created
> +for each guest that needs access to one or more AP queues.
> +The sysfs interface for creating a mediated matrix device is in:
> +
> +/sys/devices
> +... [ap_matrix]
> +......[matrix]
> +......... [mdev_supported_types]
> +............ [ap_matrix-passthrough]
> +............... create
> +............... [devices]
> +
> +A mediated AP matrix device is created by writing a UUID to the attribute
> +file named 'create', for example:
> +
> +    uuidgen > create
> +
> +When a mediated AP matrix device is created, a sysfs directory named after
> +the UUID will be created in the devices subdirectory:
> +
> +/sys/devices
> +... [ap_matrix]
> +......[matrix]
> +......... [mdev_supported_types]
> +............ [ap_matrix-passthrough]
> +............... create
> +............... [devices]
> +.................. [$uuid]
> +..................... adapters
> +..................... assign_adapter
> +..................... assign_control_domain
> +..................... assign_domain
> +..................... control_domains
> +..................... domains
> +..................... remove
> +..................... unassign_adapter
> +..................... unassign_control_domain
> +..................... unassign_domain
> +
> +There will also be three sets of attribute files created in the mediated matrix
> +device's sysfs directory:
> +
> +1 Adapter assignment
> +  * An adapter is assigned by writing the adapter's number into the
> +    'assign_adapter' file. This may be repeated multiple times to assign
> +    multiple adapters. For example, to assign adapters 5 and 6 to mediated
> +    matrix device $uuid:
> +
> +        echo 5 > assign_adapter
> +        echo 6 > assign_adapter
> +
> +  * An adapter may be unassigned by writing the adapter's number into the
> +    'unassign_adapter' file. This may also be done multiple times to
> +    unassign multiple adapters.
> +
> +  * To view the adapter numbers assigned to the AP matrix mediated device,
> +    print the 'adapters' file:
> +
> +        cat adapters
> +
> +2 Usage Domain assignment
> +  * A usage domain is assigned by writing the usage domain's number into the
> +    'assign_domain' file. This may be repeated multiple times to assign
> +    multiple usage domains. For example, to assign usage domains 4 and
> +    71 (0x47) to mediated matrix device $uuid:
> +
> +        echo 4 > assign_domain
> +        echo 47 > assign_domain
> +
> +  * A usage domain may be unassigned by writing the usage domain's number into
> +    the 'unassign_domain' file. This may be repeated multiple times to unassign
> +    multiple usage domains.
> +
> +  * To view the usage domain numbers assigned to the AP matrix mediated
> +    device, print the 'domains' file:
> +
> +        cat domains
> +
> +3 Control domain assignment
> +  * A control domain is assigned by writing the control domain's number into
> +    the 'assign_control_domain' file. This may be repeated multiple times to
> +    assign multiple control domains. It is not necessary to assign
> +    usage domain numbers as control domains; that is done automatically by
> +    default. To assign control domains 4 and 37 (0x25) to mediated matrix
> +    device $uuid:
> +
> +        echo 4 > assign_control_domain
> +        echo 25 > assign_control_domain
> +
> +  * A control domain may be unassigned by writing the control domain's number
> +    into the 'unassign_control_domain' file. This may be repeated multiple
> +    times to unassign multiple control domains.
> +
> +  * To view the control domain numbers assigned to the AP matrix mediated
> +    device, print the 'control_domains' file:
> +
> +        cat control_domains
> +
> +Note: Hot plug/unplug is not currently supported for mediated AP matrix devices,
> +      so the AP matrix resulting from assignment and/or unassignment of AP
> +      adapters, usage domains and control domains to a mediated AP matrix device
> +      will not take effect until the linux guest is rebooted.
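
The prose in this section gives adapter and domain numbers in decimal while
the echo commands use hexadecimal, which is easy to get wrong. A small helper
could be shown here. This is only a sketch: the function names to_ap_hex and
apqn are made up for illustration, and it assumes the assign_* attribute files
take hexadecimal values without a 0x prefix, as the examples above suggest.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the patch.

# to_ap_hex N: print decimal number N in hexadecimal without a 0x prefix,
# the form used in the echo examples above (assumed input format).
to_ap_hex() {
    printf '%x\n' "$1"
}

# apqn ADAPTER DOMAIN: print the xx.yyyy name of an AP queue from a
# decimal adapter number and a decimal usage domain number.
apqn() {
    printf '%02x.%04x\n' "$1" "$2"
}

# Example: usage domain 71 and control domain 37 from the text above.
to_ap_hex 71
to_ap_hex 37
apqn 5 71
```

With something like this, assigning usage domain 71 would read
echo "$(to_ap_hex 71)" > assign_domain.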
> +
> +Starting a Linux Guest Configured with an AP Matrix:
> +===================================================
> +In addition to providing the sysfs interfaces for configuring the AP matrix for
> +a linux guest, a mediated AP matrix device also acts as a communication pathway
> +between QEMU and the vfio_ap_matrix device driver. To gain access to the
> +device driver, the following option must be specified on the QEMU command line:
> +
> +-device vfio_ap_matrix,sysfsdev=$path-to-mdev
> +
> +The sysfsdev parameter specifies the path to the mediated matrix device.
> +There are a number of ways to specify this path:
> +
> +/sys/devices/ap_matrix/matrix/$uuid
> +/sys/bus/mdev/devices/$uuid
> +/sys/bus/mdev/drivers/vfio_mdev/$uuid
> +/sys/devices/ap_matrix/matrix/mdev_supported_types/ap_matrix-passthrough/devices/$uuid
> +
> +When the linux guest is subsequently started, the guest will open the mediated
> +matrix device's file descriptor to issue the command instructing the device
> +driver to configure the AP matrix for the linux guest. In response, the
> +vfio_ap_matrix device driver will update the APM, AQM, and ADM fields in the
> +guest's CRYCB with the adapter, usage domain and control domain numbers
> +specified via the mediated matrix device's sysfs attribute files. Programs
> +running on the linux guest will then:
> +
> +1. Have access to the APQNs derived from the cross product of the AP adapter
> +   and usage domain numbers specified in the APM and AQM respectively.
> +
> +2. Have authorization to process AP commands to change a control domain
> +   identified in an AP instruction sent to a valid APQN.
> +
> +Example: Configure AP Matrices for Two Linux Guests:
> +===================================================
> +Let's now provide an example to illustrate how KVM guests may be given
> +direct access to APQNs.
> +For this example, we will illustrate how to configure
> +two guests such that executing the lszcrypt command on the guests would
> +look like this:
> +
> +Guest1
> +------
> +CARD.DOMAIN TYPE  MODE
> +------------------------------
> +05          CEX5C CCA-Coproc
> +05.0004     CEX5C CCA-Coproc
> +05.00ab     CEX5C CCA-Coproc
> +06          CEX5A Accelerator
> +06.0004     CEX5A Accelerator
> +06.00ab     CEX5A Accelerator
> +
> +Guest2
> +------
> +CARD.DOMAIN TYPE  MODE
> +------------------------------
> +05          CEX5C CCA-Coproc
> +05.0047     CEX5C CCA-Coproc
> +05.00ff     CEX5C CCA-Coproc
> +
> +These are the steps for configuring Guest1 and Guest2:
> +
> +1. The first thing that needs to be done is to unbind each AP Queue device from
> +   its respective AP device driver to prevent access from the host linux system
> +   and to reserve it for use by a linux guest. For our example, let's assume
> +   the AP queues are bound to the cex4queue driver.
> +
> +   /sys/bus/ap
> +   --- [drivers]
> +   ------ [cex4queue]
> +   --------- [05.0004]
> +   --------- [05.0047]
> +   --------- [05.00ab]
> +   --------- [05.00ff]
> +   --------- [06.0004]
> +   --------- [06.00ab]
> +   --------- unbind
> +
> +   To unbind AP queue 05.0004 from the cex4queue device driver:
> +
> +       echo 05.0004 > unbind
> +
> +   This must also be done for AP queues 05.00ab, 05.0047, 05.00ff, 06.0004,
> +   and 06.00ab.
> +
> +2. The next step is to reserve the queues for use by the two KVM guests.
> +   This is accomplished by binding them to the VFIO AP matrix device driver:
> +
> +   /sys/bus/ap
> +   --- [drivers]
> +   ------ [vfio_ap_matrix]
> +   --------- bind
> +
> +   For Guest1:
> +
> +       echo 05.0004 > bind
> +       echo 05.00ab > bind
> +       echo 06.0004 > bind
> +       echo 06.00ab > bind
> +
> +   For Guest2:
> +
> +       echo 05.0047 > bind
> +       echo 05.00ff > bind
> +
> +3.
> +   Create the mediated matrix devices needed to configure the AP matrices for
> +   the two guests and to provide them with an interface to the
> +   vfio_ap_matrix driver:
> +
> +   /sys/devices/
> +   --- [ap_matrix]
> +   ------ [matrix] (this is the AP matrix device)
> +   --------- [mdev_supported_types]
> +   ------------ [ap_matrix-passthrough] (the mediated device type)
> +   --------------- create
> +   --------------- [devices]
> +
> +   To create the mediated devices for the two guests:
> +
> +       uuidgen > create
> +       uuidgen > create
> +
> +   This will create two mediated devices in the [devices] subdirectory, each
> +   named with the UUID written to the create attribute file. We call them
> +   $uuid1 and $uuid2:
> +
> +   /sys/devices/
> +   --- [ap_matrix]
> +   ------ [matrix]
> +   --------- [mdev_supported_types]
> +   ------------ [ap_matrix-passthrough]
> +   --------------- [devices]
> +   ------------------ [$uuid1]
> +   --------------------- adapters
> +   --------------------- assign_adapter
> +   --------------------- assign_control_domain
> +   --------------------- assign_domain
> +   --------------------- control_domains
> +   --------------------- domains
> +   --------------------- unassign_adapter
> +   --------------------- unassign_control_domain
> +   --------------------- unassign_domain
> +   ------------------ [$uuid2]
> +   --------------------- adapters
> +   --------------------- assign_adapter
> +   --------------------- assign_control_domain
> +   --------------------- assign_domain
> +   --------------------- control_domains
> +   --------------------- domains
> +   --------------------- unassign_adapter
> +   --------------------- unassign_control_domain
> +   --------------------- unassign_domain
> +
> +4. The administrator now needs to configure the matrices for mediated
> +   devices $uuid1 (for Guest1) and $uuid2 (for Guest2).
> +
> +   For Guest1:
> +       cd /sys/devices/ap_matrix/matrix/mdev_supported_types/ap_matrix-passthrough
> +       cd ./devices/$uuid1
> +
> +       echo 5 > assign_adapter
> +       echo 6 > assign_adapter
> +       echo 4 > assign_domain
> +       echo ab > assign_domain
> +
> +   For Guest2:
> +       cd /sys/devices/ap_matrix/matrix/mdev_supported_types/ap_matrix-passthrough
> +       cd ./devices/$uuid2
> +
> +       echo 5 > assign_adapter
> +       echo 47 > assign_domain
> +       echo ff > assign_domain
> +
> +   By architectural convention, all usage domains - i.e., domains assigned
> +   via the assign_domain attribute file - will also be configured in the ADM
> +   field of the KVM guest's CRYCB, so there is no need to assign control
> +   domains here unless you want to assign control domains that are not
> +   assigned as usage domains.
> +
> +5. Start Guest1
> +
> +   /usr/bin/qemu-system-s390x ... -device vfio_ap_matrix,sysfsdev=/sys/devices/ap_matrix/matrix/$uuid1 ...
> +
> +6. Start Guest2
> +
> +   /usr/bin/qemu-system-s390x ... -device vfio_ap_matrix,sysfsdev=/sys/devices/ap_matrix/matrix/$uuid2 ...
> \ No newline at end of file

Please add a newline :)

I think this document can be improved by some ascii art for the matrices.
Especially if you put in a matrix for the host view, two matrices for two
well-configured guests and two matrices for two guests with a bad
(conflicting) configuration. That makes it more clear why we need this
interface.
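
To go with the matrices, a quick way to check two guest configurations for
shared APQNs might also help readers see the problem. The sketch below is
not part of the patch and does not touch sysfs; the function names apqns and
shared_apqns are hypothetical. It takes adapter and usage domain numbers in
decimal, builds each guest's APQNs as the cross product described in the SIE
section, and prints any APQN that both guests would get.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# apqns "ADAPTERS" "DOMAINS": print every adapter/domain pairing of two
# space-separated lists of decimal numbers as xx.yyyy, one per line.
apqns() {
    for a in $1; do
        for d in $2; do
            printf '%02x.%04x\n' "$a" "$d"
        done
    done
}

# shared_apqns A1 D1 A2 D2: print the APQNs that appear in both guests'
# matrices. No output means the two configurations do not conflict.
shared_apqns() {
    g1=$(apqns "$1" "$2")
    for q in $(apqns "$3" "$4"); do
        # -x matches whole lines, -F treats the APQN as a literal string
        printf '%s\n' "$g1" | grep -qxF -- "$q" && printf '%s\n' "$q"
    done
    return 0
}

# Example 1 from the document (valid): prints nothing.
shared_apqns "1 2" "5 6" "1 2" "7"
# Example 2 from the document (invalid): prints 01.0006.
shared_apqns "1 2" "5 6" "1" "6 7"
```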