[Bug 76331] kernel BUG at drivers/iommu/intel-iommu.c:844!

https://bugzilla.kernel.org/show_bug.cgi?id=76331

Alex Williamson <alex.williamson@xxxxxxxxxx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dwmw2@xxxxxxxxxxxxx

--- Comment #4 from Alex Williamson <alex.williamson@xxxxxxxxxx> ---
The DRHD capability registers are reported as:

IOMMU 0: c9008010e60262
IOMMU 1: c9078010ef0462

Bits 21:16 of each value, in binary:

IOMMU 0: 100110 (0x26)
IOMMU 1: 101111 (0x2f)

From the VT-d spec (v2.2), bits 12:8 of the capability register are the
Supported Adjusted Guest Address Widths (SAGAW), defined as:

  This 5-bit field indicates the supported adjusted
  guest address widths (which in turn represents
  the levels of page-table walks for the 4KB base
  page size) supported by the hardware
  implementation.

  A value of 1 in any of these bits indicates the
  corresponding adjusted guest address width is
  supported. The adjusted guest address widths
  corresponding to various bit positions within this
  field are:
    • 0: Reserved
    • 1: 39-bit AGAW (3-level page-table)
    • 2: 48-bit AGAW (4-level page-table)
    • 3: Reserved
    • 4: Reserved

  Software must ensure that the adjusted guest
  address width used to set up the page tables is
  one of the supported guest address widths
  reported in this field.

This system therefore has one DRHD unit supporting 3-level page tables (IOMMU
0) and the other supporting 4-level page tables (IOMMU 1).
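
As a cross-check, the SAGAW decode above can be reproduced with a few lines of
C.  This is only a sketch: the helper names are made up for illustration and
are not the kernel's own macros; the bit positions are taken from the spec text
quoted above.

```c
#include <stdint.h>

/* SAGAW lives in bits 12:8 of the capability register (per the quoted spec). */
static inline unsigned int sagaw_field(uint64_t cap)
{
        return (unsigned int)((cap >> 8) & 0x1f);
}

/*
 * Widest adjusted guest address width advertised in SAGAW, in bits.
 * Only bit 1 (39-bit) and bit 2 (48-bit) are defined in the quoted text;
 * anything else is reserved and reported here as 0.
 */
static inline int widest_agaw_bits(uint64_t cap)
{
        unsigned int sagaw = sagaw_field(cap);

        if (sagaw & (1u << 2))
                return 48;      /* 4-level page table */
        if (sagaw & (1u << 1))
                return 39;      /* 3-level page table */
        return 0;
}
```

Fed the two register values reported above, sagaw_field() yields 0x02 for
IOMMU 0 and 0x04 for IOMMU 1, i.e. 39 bits and 48 bits respectively.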

Bits 21:16 are the Maximum Guest Address Width:

  This field indicates the maximum DMA virtual
  addressability supported by remapping
  hardware. The Maximum Guest Address Width
  (MGAW) is computed as (N+1), where N is the
  value reported in this field. For example, a
  hardware implementation supporting 48-bit
  MGAW reports a value of 47 (101111b) in this
  field.

  If the value in this field is X, untranslated and
  translated DMA requests to addresses above
  2^(X+1)-1 are always blocked by hardware.
  Device-TLB translation requests to address
  above 2^(X+1)-1 from allowed devices return a
  null Translation-Completion Data with R=W=0.
  Guest addressability for a given DMA request is
  limited to the minimum of the value reported
  through this field and the adjusted guest
  address width of the corresponding page-table
  structure. (Adjusted guest address widths
  supported by hardware are reported through
  the SAGAW field).

  Implementations must support MGAW at least
  equal to the physical addressability (host
  address width) of the platform.

On this system, IOMMU 0 therefore has an MGAW of 0x26 + 1 = 39 bits and IOMMU 1
has an MGAW of 0x2f + 1 = 48 bits.
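
The same arithmetic as a small C sketch.  The function name is hypothetical
(it is not claimed to match the kernel's cap_mgaw()); the field position and
the N+1 rule come straight from the spec text quoted above.

```c
#include <stdint.h>

/*
 * Maximum Guest Address Width in bits: the 6-bit field at bits 21:16
 * holds N, and the width is N + 1.
 */
static inline unsigned int mgaw_bits(uint64_t cap)
{
        return (unsigned int)((cap >> 16) & 0x3f) + 1;
}
```

For the two capability values above this gives 39 for IOMMU 0 and 48 for
IOMMU 1.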

The BUG we're hitting is:

BUG_ON(addr_width < BITS_PER_LONG && last_pfn >> addr_width);

So the last PFN of the domain is beyond the address width of the domain.

last_pfn here is created from DOMAIN_MAX_PFN(domain->gaw)

All VM domains are created with a 48 bit width (domain->gaw):

#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48

So the default last_pfn is 0xf_ffff_ffff

Given the default 48 bit width, the default domain AGAW (Adjusted Guest Address
Width) is 2 (domain->agaw)
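
For reference, the numbers above can be recomputed with the sketch below.  It
assumes 4KB pages (a 12-bit page shift) and 9 address bits per page-table
level, with AGAW index 0 corresponding to 30 bits; that base is inferred from
the SAGAW encoding (39 = 30 + 9, 48 = 30 + 18).  All names are illustrative
and are not the kernel's helpers.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT 12        /* assumed: 4KB base page */
#define EX_LEVEL_BITS 9         /* assumed: bits resolved per page-table level */

/* Highest page frame number addressable with a gaw-bit guest address width. */
static inline uint64_t ex_max_pfn(int gaw)
{
        return (UINT64_C(1) << (gaw - EX_PAGE_SHIFT)) - 1;
}

/* AGAW index for a width in bits, rounding up: 39 -> 1, 48 -> 2. */
static inline int ex_width_to_agaw(int width)
{
        return (width - 30 + EX_LEVEL_BITS - 1) / EX_LEVEL_BITS;
}

/* Inverse mapping: AGAW index back to a width in bits. */
static inline int ex_agaw_to_width(int agaw)
{
        return 30 + agaw * EX_LEVEL_BITS;
}
```

With these assumptions a 48 bit width gives a last PFN of 0xf_ffff_ffff and an
AGAW of 2, matching the values above.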

When we add devices to the domain, the gaw is updated to match:

        /* check if this iommu agaw is sufficient for max mapped address */
        addr_width = agaw_to_width(iommu->agaw);
        if (addr_width > cap_mgaw(iommu->cap))
                addr_width = cap_mgaw(iommu->cap);

        if (dmar_domain->max_addr > (1LL << addr_width)) {
                printk(KERN_ERR "%s: iommu width (%d) is not "
                       "sufficient for the mapped address (%llx)\n",
                       __func__, addr_width, dmar_domain->max_addr);
                return -EFAULT;
        }
        dmar_domain->gaw = addr_width;

iommu->agaw is calculated from the SAGAW, and will be either 1 or 2 here
depending on which IOMMU manages the device.  One bug stands out here:
domain->gaw is set to the width of the IOMMU for the last device added.  An
initial suspicion would therefore be that you could avoid the problem by
re-ordering the qemu command line to create the devices in the reverse order.

So, depending on the order devices were added, domain->gaw is either 48 bits or
39 bits and therefore last_pfn going into the BUG_ON is either 0xf_ffff_ffff or
0x7ff_ffff.

addr_width is set from 'agaw_to_width(domain->agaw) - VTD_PAGE_SHIFT' where
domain->agaw is initially 2.  However, just beyond the above code snippet we
have:

        /*
         * Knock out extra levels of page tables if necessary
         */
        while (iommu->agaw < dmar_domain->agaw) {
                struct dma_pte *pte;

                pte = dmar_domain->pgd;
                if (dma_pte_present(pte)) {
                        dmar_domain->pgd = (struct dma_pte *)
                                phys_to_virt(dma_pte_addr(pte));
                        free_pgtable_page(pte);
                }
                dmar_domain->agaw--;
        }

Therefore, when we add the device behind the 39 bit IOMMU first, we get:

last_pfn = 0x7ff_ffff
addr_width = 39 - 12 = 27

but then we add the device behind the 48 bit IOMMU and get:

last_pfn = 0xf_ffff_ffff
addr_width = 39 - 12 = 27

so last_pfn >> addr_width is non-zero, resulting in the BUG_ON.
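
The sequence can be replayed with a toy model.  This is a sketch only: the
struct and function names are invented, the domain is reduced to two integers,
BITS_PER_LONG is assumed to be 64, and the page shift is assumed to be 12.  It
models the behaviour described above (gaw follows the last device added, while
page-table levels are only ever knocked out), not the driver itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_VTD_PAGE_SHIFT 12    /* assumed: 4KB pages */
#define EX_BITS_PER_LONG  64    /* assumed: 64-bit host */

struct model_domain {
        int gaw;                /* guest address width, in bits */
        int agaw_width;         /* width covered by the page tables, in bits */
};

/* Attach a device behind an IOMMU whose usable width is iommu_width bits. */
static void model_attach(struct model_domain *d, int iommu_width)
{
        d->gaw = iommu_width;                   /* last device added wins */
        if (iommu_width < d->agaw_width)        /* levels are only removed */
                d->agaw_width = iommu_width;
}

/* True when the quoted BUG_ON condition would fire for this domain. */
static bool model_would_bug(const struct model_domain *d)
{
        uint64_t last_pfn = (UINT64_C(1) << (d->gaw - EX_VTD_PAGE_SHIFT)) - 1;
        int addr_width = d->agaw_width - EX_VTD_PAGE_SHIFT;

        return addr_width < EX_BITS_PER_LONG && (last_pfn >> addr_width) != 0;
}
```

Starting from { 48, 48 }, attaching 39 then 48 leaves gaw at 48 with page
tables covering only 39 bits, and model_would_bug() returns true.  Attaching
48 then 39 leaves both at 39 and the check passes, which is consistent with
the re-ordering suspicion.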

The fix might simply be to change setting the GAW here to:

dmar_domain->gaw = min(dmar_domain->gaw, addr_width);
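
To illustrate why min() would make the result independent of device order,
here is a minimal sketch.  The function names are hypothetical and the model
only folds a list of IOMMU widths into a domain width; whether this is the
right fix for the driver is still an open question.

```c
#include <stddef.h>

#define EX_DEFAULT_DOMAIN_ADDRESS_WIDTH 48

static int ex_min(int a, int b)
{
        return a < b ? a : b;
}

/*
 * Domain gaw after attaching devices behind IOMMUs with the given usable
 * widths, in attach order, when each attach applies min() instead of
 * overwriting the previous value.
 */
static int model_gaw_with_min(const int *iommu_widths, size_t n)
{
        int gaw = EX_DEFAULT_DOMAIN_ADDRESS_WIDTH;
        size_t i;

        for (i = 0; i < n; i++)
                gaw = ex_min(gaw, iommu_widths[i]);
        return gaw;
}
```

Both { 39, 48 } and { 48, 39 } end up with a gaw of 39, so last_pfn stays
within the width the page tables still cover.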
