Re: [PATCH] KVM test: Add PCI device assignment support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Dec 27, 2009 at 09:55:56PM -0200, Lucas Meneghel Rodrigues wrote:
> Add support to PCI device assignment on the kvm test. It supports
> both SR-IOV virtual functions and physical NIC card device
> assignment.
> 
> Single Root I/O Virtualization (SR-IOV) allows a single PCI device to
> be shared amongst multiple virtual machines while retaining the
> performance benefit of assigning a PCI device to a virtual machine.
> A common example is where a single SR-IOV capable NIC - with perhaps
> only a single physical network port - might be shared with multiple
> virtual machines by assigning a virtual function to each VM.
> 
> SR-IOV support is implemented in the kernel. The core implementation
> is contained in the PCI subsystem, but there must also be driver support
> for both the Physical Function (PF) and Virtual Function (VF) devices.
> With an SR-IOV capable device one can allocate VFs from a PF. The VFs
> surface as PCI devices which are backed on the physical PCI device by
> resources (queues, and register sets).
> 
> Device support:
> 
> In 2.6.30, the Intel® 82576 Gigabit Ethernet Controller is the only
> SR-IOV capable device supported. The igb driver has PF support and the
> igbvf has VF support.
> 
> In 2.6.31 the Neterion® X3100™ is supported as well. This device uses
> the same vxge driver for the PF as well as the VFs.
> 
> In order to configure the test:
> 
>    * For SR-IOV virtual functions passthrough, we could specify the
>      module parameter 'max_vfs' in config file.
>    * For physical NIC card pass through, we should specify the device
>      name(s).
> 
> 4th try: Implemented Yolkfull's suggestion of keeping 'max_vfs' and
> 'assignable_devices' as sepparated parameters. Yolkfull, please test this
> on your environment. Thank you!

Hi Lucas,
Sorry for the late reply. I just tested this patch and found some problems,
please see comments below:

> 
>   * Naming is consistent with "PCI assignment" instead of
>     "PCI passthrough", as it's a more correct term.
>   * No more device database file, as all information about devices
>     is stored on an attribute of the VM class (an instance of the
>     PciAssignable class), so we don't have to bother dumping this
>     info to a file.
>   * Code simplified to avoid duplication
> 
> As it's a fairly involved feature, the more reviews we get the better.
> 
> Signed-off-by: Lucas Meneghel Rodrigues <lmr@xxxxxxxxxx>
> ---
>  client/tests/kvm/kvm_utils.py          |  281 ++++++++++++++++++++++++++++++++
>  client/tests/kvm/kvm_vm.py             |   59 +++++++
>  client/tests/kvm/tests_base.cfg.sample |   20 +++
>  3 files changed, 360 insertions(+), 0 deletions(-)
> 
> diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
> index 2bbbe22..59c72a9 100644
> --- a/client/tests/kvm/kvm_utils.py
> +++ b/client/tests/kvm/kvm_utils.py
> @@ -924,3 +924,284 @@ def create_report(report_dir, results_dir):
>      reporter = os.path.join(report_dir, 'html_report.py')
>      html_file = os.path.join(results_dir, 'results.html')
>      os.system('%s -r %s -f %s -R' % (reporter, results_dir, html_file))
> +
> +
> +def get_full_pci_id(pci_id):
> +    """
> +    Get full PCI ID of pci_id.
> +
> +    @param pci_id: PCI ID of a device.
> +    """
> +    cmd = "lspci -D | awk '/%s/ {print $1}'" % pci_id
> +    status, full_id = commands.getstatusoutput(cmd)
> +    if status != 0:
> +        return None
> +    return full_id
> +
> +
> +def get_vendor_from_pci_id(pci_id):
> +    """
> +    Check out the device vendor ID according to pci_id.
> +
> +    @param pci_id: PCI ID of a device.
> +    """
> +    cmd = "lspci -n | awk '/%s/ {print $3}'" % pci_id
> +    return re.sub(":", " ", commands.getoutput(cmd))
> +
> +
> +class PciAssignable(object):
> +    """
> +    Request PCI assignable devices on host. It will check whether to request
> +    PF (physical Functions) or VF (Virtual Functions).
> +    """
> +    def __init__(self, type="nic_vf", driver=None, driver_option=None,
> +                 names=None, devices_requested=None):
> +        """
> +        Initialize parameter 'type' which could be:
> +        nic_vf: Virtual Functions
> +        nic_pf: Physical Function (actual hardware)
> +        mixed:  Both includes VFs and PFs
> +
> +        If pass through Physical NIC cards, we need to specify which devices
> +        to be assigned, e.g. 'eth1 eth2'.
> +
> +        If pass through Virtual Functions, we need to specify how many vfs
> +        are going to be assigned, e.g. passthrough_count = 8 and max_vfs in
> +        config file.
> +
> +        @param type: PCI device type.
> +        @param driver: Kernel module for the PCI assignable device.
> +        @param driver_option: Module option to specify the maximum number of
> +                VFs (eg 'max_vfs=7')
> +        @param names: Physical NIC cards correspondent network interfaces,
> +                e.g.'eth1 eth2 ...'

Add parameter interpretation for 'devices_requested'.

> +        """
> +        self.type = type
> +        self.driver = driver
> +        self.driver_option = driver_option
> +        if names:
> +            self.name_list = names.split()
> +        if devices_requested:
> +            self.devices_requested = int(devices_requested)

We need anyway to initialize 'self.devices_requested' since following codes use
this attribution, like:

...
def check_vfs_count(self):
> +        """
> +        Check VFs count number according to the parameter driver_options.
> +        """
> +        return (self.get_vfs_count == self.devices_requested)
...

> +
> +
> +    def _get_pf_pci_id(self, name, search_str):
> +        """
> +        Get the PF PCI ID according to name.
> +
> +        @param name: Name of the PCI device.
> +        @param search_str: Search string to be used on lspci.
> +        """
> +        cmd = "ethtool -i %s | awk '/bus-info/ {print $2}'" % name
> +        s, pci_id = commands.getstatusoutput(cmd)
> +        if not (s or "Cannot get driver information" in pci_id):
> +            return pci_id[5:]
> +        cmd = "lspci | awk '/%s/ {print $1}'" % search_str
> +        pci_ids = [id for id in commands.getoutput(cmd).splitlines()]
> +        nic_id = int(re.search('[0-9]+', name).group(0))
> +        if (len(pci_ids) - 1) < nic_id:
> +            return None
> +        return pci_ids[nic_id]
> +
> +
> +    def _release_dev(self, pci_id):
> +        """
> +        Release a single PCI device.
> +
> +        @param pci_id: PCI ID of a given PCI device.
> +        """
> +        base_dir = "/sys/bus/pci"
> +        full_id = get_full_pci_id(pci_id)
> +        vendor_id = get_vendor_from_pci_id(pci_id)
> +        drv_path = os.path.join(base_dir, "devices/%s/driver" % full_id)
> +        if 'pci-stub' in os.readlink(drv_path):
> +            cmd = "echo '%s' > %s/new_id" % (vendor_id, drv_path)
> +            if os.system(cmd):
> +                return False
> +
> +            stub_path = os.path.join(base_dir, "drivers/pci-stub")
> +            cmd = "echo '%s' > %s/unbind" % (full_id, stub_path)
> +            if os.system(cmd):
> +                return False
> +
> +            driver = self.dev_drivers[pci_id]
> +            cmd = "echo '%s' > %s/bind" % (full_id, driver)
> +            if os.system(cmd):
> +                return False
> +
> +        return True
> +
> +
> +    def get_vf_devs(self):
> +        """
> +        Catch all VFs PCI IDs.
> +
> +        @return: List with all PCI IDs for the Virtual Functions avaliable
> +        """
> +        if not self.sr_iov_setup():
> +            return []
> +
> +        cmd = "lspci | awk '/Virtual Function/ {print $1}'"
> +        return commands.getoutput(cmd).split()
> +
> +
> +    def get_pf_devs(self):
> +        """
> +        Catch all PFs PCI IDs.
> +
> +        @return: List with all PCI IDs for the physical hardware requested
> +        """
> +        pf_ids = []
> +        for name in self.name_list:
> +            pf_id = self._get_pf_pci_id(name, "Ethernet")
> +            if not pf_id:
> +                continue
> +            pf_ids.append(pf_id)
> +        return pf_ids
> +
> +
> +    def get_devs(self, count):
> +        """
> +        Check out all devices' PCI IDs according to their name.
> +
> +        @param count: count number of PCI devices needed for pass through
> +        @return: a list of all devices' PCI IDs
> +        """
> +        if self.type == "nic_vf":
> +            vf_ids = self.get_vf_devs()
> +        elif self.type == "nic_pf":
> +            vf_ids = self.get_pf_devs()
> +        elif self.type == "mixed":
> +            vf_ids = self.get_vf_devs()
> +            vf_ids.extend(self.get_pf_devs())
> +        return vf_ids[0:count]
> +
> +
> +    def get_vfs_count(self):
> +        """
> +        Get VFs count number according to lspci.
> +        """
> +        cmd = "lspci | grep 'Virtual Function' | wc -l"
> +        # For each VF we'll see 2 prints of 'Virtual Function', so let's
> +        # divide the result per 2
> +        return int(commands.getoutput(cmd)) / 2
> +
> +
> +    def check_vfs_count(self):
> +        """
> +        Check VFs count number according to the parameter driver_options.
> +        """
> +        return (self.get_vfs_count == self.devices_requested)
> +
> +
> +    def is_binded_to_stub(self, full_id):
> +        """
> +        Verify whether the device with full_id is already binded to pci-stub.
> +
> +        @param full_id: Full ID for the given PCI device
> +        """
> +        base_dir = "/sys/bus/pci"
> +        stub_path = os.path.join(base_dir, "drivers/pci-stub")
> +        if os.path.exists(os.path.join(stub_path, full_id)):
> +            return True
> +        return False
> +
> +
> +    def sr_iov_setup(self):
> +        """
> +        Ensure the PCI device is working in sr_iov mode.
> +
> +        Check if the PCI hardware device drive is loaded with the appropriate,
> +        parameters (number of VFs), and if it's not, perform setup.
> +
> +        @return: True, if the setup was completed successfuly, False otherwise.
> +        """
> +        re_probe = False
> +        s, o = commands.getstatusoutput('lsmod | grep %s' % self.driver)
> +        if s:
> +            re_probe = True
> +        elif not self.check_vfs_count():
> +            os.system("modprobe -r %s" % self.driver)
> +            re_probe = True
> +
> +        # Re-probe driver with proper number of VFs
> +        if re_probe:
> +            cmd = "modprobe %s %s" % (self.driver, self.driver_option)
> +            s, o = commands.getstatusoutput(cmd)
> +            if s:
> +                return False
> +            if not self.check_vfs_count():
> +                return False
> +            return True
> +
> +
> +    def request_devs(self):
> +        """
> +        Implement setup process: unbind the PCI device and then bind it
> +        to the pci-stub driver.
> +
> +        @return: a list of successfully requested devices' PCI IDs.
> +        """
> +        base_dir = "/sys/bus/pci"
> +        stub_path = os.path.join(base_dir, "drivers/pci-stub")
> +
> +        self.pci_ids = self.get_devs(self.devices_requested)
> +        logging.debug("The following pci_ids were found: %s" % self.pci_ids)
> +        requested_pci_ids = []
> +        self.dev_drivers = {}
> +
> +        # Setup all devices specified for assignment to guest
> +        for pci_id in self.pci_ids:
> +            full_id = get_full_pci_id(pci_id)
> +            if not full_id:
> +                continue
> +            drv_path = os.path.join(base_dir, "devices/%s/driver" % full_id)
> +            dev_prev_driver= os.path.realpath(os.path.join(drv_path,
> +                                              os.readlink(drv_path)))
> +            self.dev_drivers[pci_id] = dev_prev_driver
> +
> +            # Judge whether the device driver has been binded to stub
> +            if not self.is_binded_to_stub(full_id):
> +                logging.debug("Binding device %s to stub" % full_id)
> +                vendor_id = get_vendor_from_pci_id(pci_id)
> +                stub_new_id = os.path.join(stub_path, 'new_id')
> +                unbind_dev = os.path.join(drv_path, 'unbind')
> +                stub_bind = os.path.join(stub_path, 'bind')
> +
> +                info_write_to_files = [(vendor_id, stub_new_id),
> +                                       (full_id, unbind_dev),
> +                                       (full_id, stub_bind)]
> +
> +                for content, file in info_write_to_files:
> +                    try:
> +                        utils.open_write_close(content, file)
> +                    except IOError:
> +                        logging.debug("Failed to write %s to file %s" %
> +                                      (content, file))
> +                        continue
> +
> +                if not self.is_binded_to_stub(full_id):
> +                    logging.error("Binding device %s to stub failed" %
> +                                  pci_id)
> +                    continue
> +            else:
> +                logging.debug("Device %s already binded to stub" % pci_id)
> +            requested_pci_ids.append(pci_id)
> +        self.pci_ids = requested_pci_ids
> +        return self.pci_ids
> +
> +
> +    def release_devs(self):
> +        """
> +        Release all PCI devices currently assigned to VMs back to the
> +        virtualization host.
> +        """
> +        try:
> +            for pci_id in self.dev_drivers:
> +                if not self._release_dev(pci_id):
> +                    logging.error("Failed to release device %s to host" %
> +                                  pci_id)
> +                else:
> +                    logging.info("Released device %s successfully" % pci_id)
> +        except:
> +            return
> diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
> index cc314d4..a86c124 100755
> --- a/client/tests/kvm/kvm_vm.py
> +++ b/client/tests/kvm/kvm_vm.py
> @@ -304,6 +304,12 @@ class VM:
>          elif params.get("uuid"):
>              qemu_cmd += " -uuid %s" % params.get("uuid")
>  
> +        # If the PCI assignment step went OK, add each one of the PCI assigned
> +        # devices to the qemu command line.
> +        if self.pci_assignable:
> +            for pci_id in self.pa_pci_ids:
> +                qemu_cmd += " -pcidevice host=%s" % pci_id
> +
>          return qemu_cmd
>  
>  
> @@ -392,6 +398,50 @@ class VM:
>                  self.uuid = f.read().strip()
>                  f.close()
>  
> +            if not params.get("pci_assignable") == "no":
> +                pa_type = params.get("pci_assignable")
> +                pa_devices_requested = params.get("devices_requested")
> +
> +                # Virtual Functions (VF) assignable devices
> +                if pa_type == "vf":
> +                    pa_driver = params.get("driver")
> +                    pa_driver_option = params.get("driver_option")
> +                    self.pci_assignable = kvm_utils.PciAssignable(type=pa_type,
> +                                        driver=pa_driver,
> +                                        driver_option=pa_driver_option,
> +                                        devices_requested=pa_devices_requested)
> +                # Physical NIC (PF) assignable devices
> +                elif pa_type == "pf":
> +                    pa_device_names = params.get("device_names")
> +                    self.pci_assignable = kvm_utils.PciAssignable(type=pa_type,
> +                                         names=pa_device_names,
> +                                         devices_requested=pa_devices_requested)
> +                # Working with both VF and PF
> +                elif pa_type == "mixed":
> +                    pa_device_names = params.get("device_names")
> +                    pa_driver = params.get("driver")
> +                    pa_driver_option = params.get("driver_option")
> +                    self.pci_assignable = kvm_utils.PciAssignable(type=pa_type,
> +                                        driver=pa_driver,
> +                                        driver_option=pa_driver_option,
> +                                        names=pa_device_names,
> +                                        devices_requested=pa_devices_requested)
> +
> +                self.pa_pci_ids = self.pci_assignable.request_devs()
> +
> +                if self.pa_pci_ids:
> +                    logging.debug("Successfuly assigned devices: %s" %
> +                                  self.pa_pci_ids)
> +                else:
> +                    logging.error("No PCI assignable devices were assigned "
> +                                  "and 'pci_assignable' is defined to %s "
> +                                  "on your config file. Aborting VM creation." %
> +                                  pa_type)
> +                    return False
> +
> +            else:
> +                self.pci_assignable = None

It's weird that even though we initialize 'self.pci_assignable' when params.get("pci_assignable") == "no", we still
got backtrace:

03:27:38 ERROR| child process failed
03:27:38 INFO | 		FAIL	kvm.Fedora.11.64.boot	kvm.Fedora.11.64.boot	timestamp=1263284858	localtime=Jan 12 03:27:38	Unhandled AttributeError: VM instance has no attribute 'pci_assignable'
		  Traceback (most recent call last):
		    File "/root/pci_assign/client/common_lib/test.py", line 595, in _call_test_function
		      return func(*args, **dargs)
		    File "/root/pci_assign/client/common_lib/test.py", line 281, in execute
		      postprocess_profiled_run, args, dargs)
		    File "/root/pci_assign/client/common_lib/test.py", line 202, in _call_run_once
		      self.run_once(*args, **dargs)
		    File "/root/pci_assign/client/tests/kvm/kvm.py", line 69, in run_once
		      kvm_preprocessing.postprocess(self, params, env)
		    File "/root/pci_assign/client/tests/kvm/kvm_preprocessing.py", line 271, in postprocess
		      process(test, params, env, postprocess_image, postprocess_vm)
		    File "/root/pci_assign/client/tests/kvm/kvm_preprocessing.py", line 178, in process
		      vm_func(test, vm_params, env, vm_name)
		    File "/root/pci_assign/client/tests/kvm/kvm_preprocessing.py", line 126, in postprocess_vm
		      vm.destroy(gracefully = params.get("kill_vm_gracefully") == "yes")
		    File "/root/pci_assign/client/tests/kvm/kvm_vm.py", line 590, in destroy
		      if self.pci_assignable:
		  AttributeError: VM instance has no attribute 'pci_assignable'
03:27:38 INFO | 	END FAIL	kvm.Fedora.11.64.boot	kvm.Fedora.11.64.boot	timestamp=1263284858	localtime=Jan 12 03:27:38	


So I added 'self.pci_assignable = None' in __init__() and fixed the problem.

> +
>              # Make qemu command
>              qemu_command = self.make_qemu_command()
>  
> @@ -537,6 +587,8 @@ class VM:
>              # Is it already dead?
>              if self.is_dead():
>                  logging.debug("VM is already down")
> +                if self.pci_assignable:
> +                    self.pci_assignable.release_devs()
>                  return
>  
>              logging.debug("Destroying VM with PID %d..." %
> @@ -557,6 +609,9 @@ class VM:
>                              return
>                      finally:
>                          session.close()
> +                        if self.pci_assignable:
> +                            self.pci_assignable.release_devs()
> +
>  
>              # Try to destroy with a monitor command
>              logging.debug("Trying to kill VM with monitor command...")
> @@ -566,6 +621,8 @@ class VM:
>                  # Wait for the VM to be really dead
>                  if kvm_utils.wait_for(self.is_dead, 5, 0.5, 0.5):
>                      logging.debug("VM is down")
> +                    if self.pci_assignable:
> +                        self.pci_assignable.release_devs()
>                      return
>  
>              # If the VM isn't dead yet...
> @@ -575,6 +632,8 @@ class VM:
>              # Wait for the VM to be really dead
>              if kvm_utils.wait_for(self.is_dead, 5, 0.5, 0.5):
>                  logging.debug("VM is down")
> +                if self.pci_assignable:
> +                    self.pci_assignable.release_devs()
>                  return
>  
>              logging.error("Process %s is a zombie!" % self.process.get_pid())
> diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
> index a403399..b7ee2e1 100644
> --- a/client/tests/kvm/tests_base.cfg.sample
> +++ b/client/tests/kvm/tests_base.cfg.sample
> @@ -884,3 +884,23 @@ variants:
>          pre_command = "/usr/bin/python scripts/hugepage.py /mnt/kvm_hugepage"
>          extra_params += " -mem-path /mnt/kvm_hugepage"
>  
> +
> +variants:
> +    - @no_pci_assignable:
> +        pci_assignable = no
> +    - pf_assignable:
> +        pci_assignable = pf
> +        device_names = eth1
> +    - vf_assignable:
> +        pci_assignable = vf
> +        # Driver (kernel module) that supports SR-IOV hardware.
> +        # As of today (30-11-2009), we have 2 drivers for this type of hardware:
> +        # Intel® 82576 Gigabit Ethernet Controller - igb
> +        # Neterion® X3100™ - vxge
> +        driver = igb
> +        # Driver option to specify the maximum number of virtual functions
> +        # (on vxge the option is , for example, is max_config_dev)
> +        # the default below is for the igb driver
> +        driver_option = "max_vfs=7"
> +        # Number of devices that are going to be requested.
> +        devices_requested = 7
> -- 
> 1.6.5.2

After fixing above two problems, patch runs good:

# ./scan_results.py 
Test                                            		Status	Seconds	Info
----                                            		------	-------	----
        (Result file: ../../results/default/status)
Fedora.11.64.boot                               		GOOD	51	completed successfully
pf_assignable.Fedora.11.64.boot                 		GOOD	19	completed successfully
vf_assignable.Fedora.11.64.boot                 		GOOD	14	completed successfully
----                                            		GOOD	117	

Thanks very much for improving this patch. :-)

> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux