>> >> Hi, all
>> >> I'm using VFIO to assign Intel 82599 VFs to VMs, and I've run into a problem:
>> >> the 82599 PF and its VFs belong to the same iommu_group, but I want to assign only some VFs to one VM, and some other VFs to another VM, ...,
>> >> so how can I unbind only (some of) the VFs but not the PF?
>> >> I read the kernel doc vfio.txt, and I'm not sure whether I should unbind all of the devices that belong to one iommu_group.
>> >> If so, because the PF and its VFs belong to the same iommu_group, if I unbind the PF, its VFs also disappear.
>> >> I think I misunderstand something,
>> >> any advice?
>> >
>> >This occurs when the PF is installed behind components in the system
>> >that do not support PCIe Access Control Services (ACS). The IOMMU group
>> >contains both the PF and the VF because upstream transactions can be
>> >re-routed downstream by these non-ACS components before being translated
>> >by the IOMMU. Please provide 'sudo lspci -vvv', 'lspci -n', and kernel
>> >version and we might be able to give you some advice on how to work
>> >around the problem. Thanks,
>> >
>> # lspci | grep Ether
>> 02:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
>> 02:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
>> 08:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
>> 08:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
>> 09:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
>> 09:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
>> 0a:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
>> 0a:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
>> 0b:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
>> 0b:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
>> 0c:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8110SC/8169SC Gigabit Ethernet (rev 10)
>>
>> I want to direct-assign the VFs of the Intel 82599 (02:00.0 or 02:00.1) to a VM,
>> # lspci -t
>> -[0000:00]-+-00.0
>>            +-01.0-[01]--
>>            +-01.1-[02-03]--+-00.0
>>            |               \-00.1
>>            +-02.0
>>            +-06.0-[04]--
>>            +-16.0
>>            +-1a.0
>>            +-1c.0-[05-0b]----00.0-[06-0b]--+-04.0-[07]--
>>            |                               +-05.0-[08]--+-00.0
>>            |                               |            \-00.1
>>            |                               +-06.0-[09]--+-00.0
>>            |                               |            \-00.1
>>            |                               +-08.0-[0a]--+-00.0
>>            |                               |            \-00.1
>>            |                               \-09.0-[0b]--+-00.0
>>            |                                            \-00.1
>>            +-1d.0
>>            +-1e.0-[0c]----00.0
>>            +-1f.0
>>            +-1f.2
>>            \-1f.3
>>
>> lspci -vvv -s 02:00.0
>> 02:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection (rev 01)
>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Latency: 0, Cache Line Size: 64 bytes
>>         Interrupt: pin A routed to IRQ 17
>>         Region 0: Memory at f7e20000 (64-bit, non-prefetchable) [size=128K]
>>         Region 2: I/O ports at e020 [size=32]
>>         Region 4: Memory at f7e44000 (64-bit, non-prefetchable) [size=16K]
>>         Capabilities: [40] Power Management version 3
>>         Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>>         Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
>>         Capabilities: [a0] Express (v2) Endpoint, MSI 00
>>         Capabilities: [e0] Vital Product Data
>>         Capabilities: [100 v1] Advanced Error Reporting
>>         Capabilities: [140 v1] Device Serial Number 00-90-0b-ff-ff-29-33-c2
>>         Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
>>         Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
>>         Kernel driver in use: ixgbe
>>
>> # lspci -vvv -s 00:01.1
>> 00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09) (prog-if 00 [Normal decode])
>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Latency: 0, Cache Line Size: 64 bytes
>>         Bus: primary=00, secondary=02, subordinate=03, sec-latency=0
>>         I/O behind bridge: 0000e000-0000efff
>>         Memory behind bridge: f7e00000-f7efffff
>>         Prefetchable memory behind bridge: 00000000dfb00000-00000000dfefffff
>>         Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
>>         BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
>>                 PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>>         Capabilities: [88] Subsystem: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port
>>         Capabilities: [80] Power Management version 3
>>         Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
>>         Capabilities: [a0] Express (v2) Root Port (Slot+), MSI 00
>>         Capabilities: [100 v1] Virtual Channel
>>         Capabilities: [140 v1] Root Complex Link
>>         Capabilities: [d94 v1] #19
>>         Kernel driver in use: pcieport
>>
>> The Intel 82599 (02:00.0 or 02:00.1) is behind the PCI bridge (00:01.1);
>> does the 00:01.1 PCI bridge support ACS?
>
>It does not and that's exactly the problem. We must assume that the
>root port can redirect a transaction from a subordinate device back to
>another subordinate device without IOMMU translation when ACS support is
>not present. If you had a device plugged in below 00:01.0, we'd also
>need to assume that non-IOMMU translated peer-to-peer between devices
>behind either function, 00:01.0 or 00:01.1, is possible.
>
>Intel has indicated that processor root ports for all Xeon class
>processors should support ACS and have verified isolation for PCH based
>root ports allowing us to support quirks in place of ACS support. I'm
>not aware of any efforts at Intel to verify isolation capabilities of
>root ports on client processors. They are however aware that lack of
>ACS is a limiting factor for usability of VT-d, and I hope that we'll
>see future products with ACS support.
>
>Chances are good that the PCH root port at 00:1c.0 is supported by an
>ACS quirk, but it seems that your system has a PCIe switch below the
>root port. If the PCIe switch downstream ports support ACS, then you
>may be able to move the 82599 to the empty slot at bus 07 to separate
>the VFs into different IOMMU groups. Thanks,
>

Thanks, Alex,
how can I tell whether a PCI bridge/device supports the ACS capability?
I ran "lspci -vvv -s | grep -i ACS", and nothing matched.

# lspci -vvv -s 00:1c.0
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=00, secondary=05, subordinate=0b, sec-latency=0
        I/O behind bridge: 00002000-00003fff
        Memory behind bridge: f7800000-f7cfffff
        Prefetchable memory behind bridge: 00000000f0000000-00000000f03fffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #1, Speed 5GT/s, Width x4, ASPM L0s L1, Latency L0 <1us, L1 <4us
                        ClockPM- Surprise- LLActRep+ BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt+
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
                        Slot #0, PowerLimit 25.000W; Interlock- NoCompl+
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
                        Changed: MRL- PresDet- LinkState-
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range BC, TimeoutDis+ ARIFwd-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- ARIFwd-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                        Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                        Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                        EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: 00000000  Data: 0000
        Capabilities: [90] Subsystem: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1
        Capabilities: [a0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Kernel driver in use: pcieport

Thanks,
Zhang Haoyu

>Alex
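
For what it's worth, the ACS check asked about above can be scripted. The following is only a rough sketch, not an authoritative tool: it assumes that lspci lists the capability on a line containing "Access Control Services" when the device exposes it, and that lspci is run as root so the extended capabilities are readable. The helper name acs_report and the "ff:00.0" sample device are made up for illustration; the 00:1c.0 lines are abridged from the output in this thread.

```python
import re

# Header line of a device in `lspci -vvv` output, e.g. "00:1c.0 PCI bridge: ..."
# (optionally preceded by a PCI domain such as "0000:").
DEVICE_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s", re.I)


def acs_report(lspci_text):
    """Map each device address to True if an ACS capability line is listed."""
    report = {}
    current = None
    for line in lspci_text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            # A new device block starts; assume no ACS until we see it.
            current = match.group(1)
            report[current] = False
        elif current and "Capabilities:" in line and "Access Control Services" in line:
            report[current] = True
    return report


# Abridged 00:1c.0 output from this thread, plus one hypothetical device
# (ff:00.0) that is invented here only to show the positive case.
SAMPLE = (
    "00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family"
    " PCI Express Root Port 1 (rev b5)\n"
    "\tCapabilities: [40] Express (v2) Root Port (Slot+), MSI 00\n"
    "\tCapabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-\n"
    "ff:00.0 PCI bridge: hypothetical downstream port\n"
    "\tCapabilities: [148 v1] Access Control Services\n"
)

for address, has_acs in acs_report(SAMPLE).items():
    print(address, "ACS capability listed" if has_acs else "no ACS capability listed")
```

On a real machine the text could come from subprocess.run(["lspci", "-vvv"], capture_output=True, text=True).stdout instead of SAMPLE. Note that a missing ACS capability does not by itself mean the kernel treats the port as non-isolated, since, as mentioned above, some PCH root ports are covered by quirks.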