On Fri, Nov 10, 2023 at 07:22:31PM +0100, Michal Wajdeczko wrote: > The Single Root I/O Virtualization (SR-IOV) extension to the PCI > Express (PCIe) specification suite is supported starting from 12th > generation of Intel Graphics processors. > > This RFC aims to explain how do we want to add support for SR-IOV > to the new Xe driver and to propose related additions to the sysfs. > > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@xxxxxxxxx> > Cc: Oded Gabbay <ogabbay@xxxxxxxxxx> > Cc: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> > Cc: Daniel Vetter <daniel@xxxxxxxx> > --- > Documentation/gpu/rfc/index.rst | 5 + > Documentation/gpu/rfc/sysfs-driver-xe-sriov | 501 ++++++++++++++++++++ > Documentation/gpu/rfc/xe_sriov.rst | 192 ++++++++ > 3 files changed, 698 insertions(+) > create mode 100644 Documentation/gpu/rfc/sysfs-driver-xe-sriov > create mode 100644 Documentation/gpu/rfc/xe_sriov.rst > > diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst > index e4f7b005138d..fc5bc447f30d 100644 > --- a/Documentation/gpu/rfc/index.rst > +++ b/Documentation/gpu/rfc/index.rst > @@ -35,3 +35,8 @@ host such documentation: > .. toctree:: > > xe.rst > + > +.. toctree:: > + :maxdepth: 1 > + > + xe_sriov.rst > diff --git a/Documentation/gpu/rfc/sysfs-driver-xe-sriov b/Documentation/gpu/rfc/sysfs-driver-xe-sriov > new file mode 100644 > index 000000000000..77748204dd83 > --- /dev/null > +++ b/Documentation/gpu/rfc/sysfs-driver-xe-sriov > @@ -0,0 +1,501 @@ > +.. Documentation/ABI/testing/sysfs-driver-xe-sriov > +.. > +.. Intel Xe driver ABI (SR-IOV extensions) > +.. > + The Single Root I/O Virtualization (SR-IOV) extension to > + the PCI Express (PCIe) specification suite is supported > + starting from 12th generation of Intel Graphics processors. > + > + This document describes Xe driver specific additions. 
> + > + For description of generic SR-IOV sysfs attributes see > + "Documentation/ABI/testing/sysfs-bus-pci" document. > + > + /sys/bus/pci/drivers/xe/BDF/ > + ├── sriov_auto_provisioning > + │ ├── admin_mode > + │ ├── enabled > + │ ├── reset_defaults > + │ ├── resources > + │ │ ├── default_contexts_quota > + │ │ ├── default_doorbells_quota > + │ │ ├── default_ggtt_quota > + │ │ └── default_lmem_quota > + │ ├── scheduling > + │ │ ├── default_exec_quantum_ms > + │ │ └── default_preempt_timeout_us > + │ └── monitoring > + │ ├── default_cat_error_count > + │ ├── default_doorbell_time_us > + │ ├── default_engine_reset_count > + │ ├── default_h2g_time_us > + │ ├── default_irq_time_us > + │ └── default_page_fault_count > + > + /sys/bus/pci/drivers/xe/BDF/ > + ├── sriov_extensions > + │ ├── monitoring_period_ms > + │ ├── strict_scheduling_enabled > + │ ├── pf > + │ │ ├── device -> ../../../BDF > + │ │ ├── priority > + │ │ ├── tile0 > + │ │ │ ├── gt0 > + │ │ │ │ ├── exec_quantum_ms > + │ │ │ │ ├── preempt_timeout_us > + │ │ │ │ └── thresholds > + │ │ │ │ ├── cat_error_count > + │ │ │ │ ├── doorbell_time_us > + │ │ │ │ ├── engine_reset_count > + │ │ │ │ ├── h2g_time_us > + │ │ │ │ ├── irq_time_us > + │ │ │ │ └── page_fault_count > + │ │ │ └── gtX > + │ │ └── tileT > + │ ├── vf1 > + │ │ ├── device -> ../../../BDF+1 > + │ │ ├── stop > + │ │ ├── tile0 > + │ │ │ ├── ggtt_quota > + │ │ │ ├── lmem_quota > + │ │ │ ├── gt0 > + │ │ │ │ ├── contexts_quota > + │ │ │ │ ├── doorbells_quota > + │ │ │ │ ├── exec_quantum_ms > + │ │ │ │ ├── preempt_timeout_us > + │ │ │ │ └── thresholds > + │ │ │ │ ├── cat_error_count > + │ │ │ │ ├── doorbell_time_us > + │ │ │ │ ├── engine_reset_count > + │ │ │ │ ├── h2g_time_us > + │ │ │ │ ├── irq_time_us > + │ │ │ │ └── page_fault_count > + │ │ │ └── gtX > + │ │ └── tileT > + │ └── vfN > +.. 
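To make the proposed layout concrete, here is a minimal sketch that builds attribute paths following the tree quoted above. The helper name `sriov_attr_path` and the example BDF are hypothetical and only illustrate the nesting (function, then tile, then GT); nothing here is part of the patch.

```python
from pathlib import PurePosixPath

# Hypothetical helper; only the directory layout comes from the quoted tree.
def sriov_attr_path(bdf, function, attr, tile=None, gt=None):
    """Build the path to an attribute under sriov_extensions.

    function is "pf" or "vfN" (N is 1-based). tile and gt are optional
    because some attributes live per function or per tile.
    """
    if gt is not None and tile is None:
        raise ValueError("a GT index needs a tile index")
    path = PurePosixPath("/sys/bus/pci/drivers/xe") / bdf / "sriov_extensions" / function
    if tile is not None:
        path /= f"tile{tile}"
    if gt is not None:
        path /= f"gt{gt}"
    return str(path / attr)
```

For example, `sriov_attr_path("0000:00:02.0", "vf1", "ggtt_quota", tile=0)` addresses a per-tile quota, while passing `gt=0` as well addresses a per-GT attribute such as `exec_quantum_ms`.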
> + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/ > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + This directory appears on the device when: > + > + - device supports SR-IOV, and > + - device is a Physical Function (PF), and > + - xe driver supports SR-IOV PF on given device, and > + - xe driver supports automatic VFs provisioning. > + > + This directory is used as a root for all attributes related to > + automatic provisioning of SR-IOV Physical Function (PF) and/or > + Virtual Functions (VFs). > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/enabled > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + (RW) bool (0, 1) > + > + This file represents configuration flag for the automatic VFs > + (un)provisioning that could be performed by the PF. > + > + The default value is 1 (true). > + > + This flag can be set to false, unless manual provisioning is not > + applicable for given platform or it is not supported by current > + PF implementation. In such cases -EPERM will be returned. > + > + This flag will be automatically set to false when there will be > + other attempts to change any of VF's resource provisioning. > + See "sriov_extensions" section for details. > + > + This flag can be set back to true if and only if all VFs are > + fully unprovisioned, otherwise -EEXIST error will be returned. > + > + false = "disabled" > + When disabled, then PF will not attempt to do automatic > + VFs provisioning when VFs are being enabled and will not > + perform automatic unprovisioning of the VFs when VFs will > + be disabled. > + > + true = "enabled" > + When enabled, then on VFs enabling PF will do automatic > + VFs provisioning based on the default settings described > + below. > + > + If automatic VFs provisioning fails due to some reasons, > + then VFs will not be enabled. 
> + > + If enabled, all resources allocated during VFs enabling > + will be released during VFs disabling (automatic unprovisioning). > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/admin_mode > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + (RW) bool (0, 1) > + > + This file represents configuration flag for the automatic VFs > + provisioning that could be performed by the PF. > + > + The default value depends on the platform type. > + > + This flag can be changed any time, but will have no effect if > + VFs are already provisioned. > + > + If enabled (default on discrete platforms) then the PF will > + retain only minimum hardcoded resources for its own use when > + doing VFs automatic provisioning and will not use any default > + values described below for its own configuration. > + > + If disabled (default on integrated platforms) then the PF will > + treat itself like yet another additional VF in all fair resource > + allocations and will also try to apply default provisioning > + values described below for its own configuration. > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/reset_defaults > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + (WO) bool (1) > + > + Writing to this file will reset all default provisioning parameters > + listed below to the default values. 
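The rules for the "enabled" flag quoted above are easier to review as a small state model. This is a toy sketch of how I read the description, not driver code: the class and method names are made up, and the return values only mimic the negative errno convention of a sysfs store.

```python
import errno

class AutoProvisioning:
    """Toy model of the sriov_auto_provisioning/enabled flag."""

    def __init__(self, manual_supported=True):
        self.enabled = True              # documented default is 1 (true)
        self.manual_supported = manual_supported
        self.provisioned_vfs = set()     # VFs that currently hold resources

    def write_enabled(self, value):
        """Return 0 on success or a negative errno."""
        if not value:
            if not self.manual_supported:
                return -errno.EPERM      # manual provisioning not available
            self.enabled = False
            return 0
        if self.provisioned_vfs:
            return -errno.EEXIST         # all VFs must be unprovisioned first
        self.enabled = True
        return 0

    def manual_provision(self, vf):
        """Any manual change to a VF's resources switches the flag off."""
        self.provisioned_vfs.add(vf)
        self.enabled = False

    def unprovision(self, vf):
        self.provisioned_vfs.discard(vf)
```

In this reading, re-enabling automatic provisioning after a manual change requires unprovisioning every VF first.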
> + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/resources/default_contexts_quota > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/resources/default_doorbells_quota > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/resources/default_ggtt_quota > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/resources/default_lmem_quota > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/scheduling/default_exec_quantum_ms > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/scheduling/default_preempt_timeout_us > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/monitoring/default_cat_error_count > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/monitoring/default_doorbell_time_us > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/monitoring/default_engine_reset_count > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/monitoring/default_h2g_time_us > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/monitoring/default_irq_time_us > +What: /sys/bus/pci/drivers/xe/.../sriov_auto_provisioning/monitoring/default_page_fault_count > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + These files represent default provisioning that should be used > + for VFs automatic provisioning. > + > + These values can be changed any time, but will have no effect if > + VFs are already provisioned. > + > + default_contexts_quota: (RW) integer 0..U32_MAX > + The number of GuC context IDs to provide to the VF. > + The default value is 0 (use fair allocations). > + See "sriov_extensions/vfN/tileT/gtX/contexts_quota" for details. > + > + default_doorbells_quota: (RW) integer 0..U32_MAX > + The number of GuC doorbells to provide to the VF. > + The default value is 0 (use fair allocations). > + See "sriov_extensions/vfN/tileT/gtX/doorbells_quota" for details. 
> + > + default_ggtt_quota: (RW) integer 0..U32_MAX > + The size of the GGTT address space (in bytes) to provide to the VF. > + The default value is 0 (use fair allocations). > + See "sriov_extensions/vfN/tileT/ggtt_quota" for details. > + > + default_lmem_quota: (RW) integer 0..U32_MAX > + The size of the LMEM (in bytes) to provide to the VF. > + The default value is 0 (use fair allocations). > + See "sriov_extensions/vfN/tileT/lmem_quota" for details. > + > + default_exec_quantum_ms: (RW) integer 0..U32_MAX > + The GT execution quantum (in millisecs) assigned to the function. > + The default value is 0 (infinity). > + See "sriov_extensions/vfN/tileT/gtX/exec_quantum_ms" for details. > + > + default_preempt_timeout_us: (RW) integer 0..U32_MAX > + The GT preemption timeout (in microsecs) assigned to the function. > + The default value is 0 (infinity). > + See "sriov_extensions/vfN/tileT/gtX/preempt_timeout_us" for details. > + > + default_cat_error_count: (RW) integer 0..U32_MAX > + default_doorbell_time_us: (RW) integer 0..U32_MAX > + default_engine_reset_count: (RW) integer 0..U32_MAX > + default_h2g_time_us: (RW) integer 0..U32_MAX > + default_irq_time_us: (RW) integer 0..U32_MAX > + default_page_fault_count: (RW) integer 0..U32_MAX > + The monitoring threshold to be set for the function. > + The default value is 0 (don't monitor). > + See "sriov_extensions/vfN/tileT/gtX/thresholds" for details. > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/ > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + This directory appears on Xe device when: > + > + - device supports SR-IOV, and > + - device is a Physical Function (PF), and > + - driver is enabled to support SR-IOV PF on given device. > + > + This directory is used as a root for all attributes required to > + manage both Physical Function (PF) and Virtual Functions (VFs).
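The default quota attributes above use 0 to mean "use fair allocations", and admin_mode decides whether the PF takes part in that split. The RFC does not define the formula, so the following is only a sketch under the assumption of an equal integer split. The function name and the `pf_reserved` parameter (standing in for the "minimum hardcoded resources") are hypothetical.

```python
def default_vf_quota(total, num_vfs, default_quota=0, admin_mode=True, pf_reserved=0):
    """Per-VF quota of one resource under automatic provisioning.

    default_quota == 0 means "use fair allocation". The equal split used
    here is an assumption, not something the RFC specifies.
    """
    if num_vfs <= 0:
        raise ValueError("num_vfs must be positive")
    if default_quota:
        return default_quota
    if admin_mode:
        # PF retains only a fixed minimum; the VFs split what is left.
        return (total - pf_reserved) // num_vfs
    # PF is treated as one more VF in the fair split.
    return total // (num_vfs + 1)
```

With 1024 units and 7 VFs, a PF that counts as an extra VF (admin_mode disabled) leaves 128 units per function in this model.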
> + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/strict_scheduling_enabled > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + (RW) bool > + > + This file represents a flag used to determine if scheduling > + parameters should be respected even if there is no active > + workloads submitted by the PF or VFs. > + > + This flag is disabled by default, unless strict scheduling is > + not applicable on given platform. In such case this file will > + be read-only. > + > + The change to this file may have no effect if VFs are not yet enabled. > + If strict scheduling can't be enabled in GuC then write will fail with -EIO. > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/monitoring_period_ms > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + (RW) integer > + > + This file represents the configuration knob used by adverse event > + monitoring. A value here is the period in millisecs during which > + events are counted and the total is checked against a threshold. > + See "sriov_extensions/vfN/tileT/gtX/thresholds" for more details. > + > + Default is 0 (monitoring is disabled). > + > + If monitoring capability is not available, then attempt to enable > + will fail with -EPERM error. If monitoring can't be enabled in > + GuC then write will fail with -EIO. > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/ > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + This directory holds all attributes related to the SR-IOV > + Physical Function (PF). > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/ > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + This directory holds all attributes related to the SR-IOV > + Virtual Function (VF). > + > + Note that VF numbers (N) are 1-based as described in PCI SR-IOV specification. 
> + The Xe driver implementation follows that naming schema. > + > + There will be "vf1", "vf2" up to "vfN" directories, where N matches > + value of the PCI "sriov_totalvfs" attribute. > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/tileT/ > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/ > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + This directory holds all SR-IOV attributes related to the device tile. > + The tile numbers (T) start from 0. > + > + There is at least one "tile0/" directory present. > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/tileT/gtX > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + This directory holds all SR-IOV attributes related to the device GT. > + The GT numbers (X) start from 0. > + > + There is at least one "gt0/" directory present. > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/device > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/device > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + (symbolic link) > + > + Backlink to the PCI device entry representing given function. > + For PF this link is always present. > + For VF this link is present only for currently enabled VFs. > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/priority > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + (RW) string > + > + This file represents a GuC Scheduler knob to override the default > + round-robin or FIFO scheduler policies implemented by the GuC. > + > + The default value is "peer". > + > + This flag can be changed, unless such change is not applicable > + for given platform or is not supported by current GuC firmware.
> + In such case this file could be read-only or will return -EPERM > + on write attempt. > + > + "immediate" > + GuC will Schedule PF workloads immediately and PF > + workloads only until the PF's work queues in GuC > + are empty. > + > + "lazy" > + GuC will Schedule PF workloads at the next opportune > + moment and PF workloads only until the PF work queues > + in GuC are empty. > + > + "peer" > + GuC Scheduler will treat PF and VFs with equal priority. > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/stop > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + (WO) bool (1) > + > + Write to this file will force GuC to stop handle any requests from > + this VF, but without triggering a FLR. > + To recover, the full FLR must be issued using generic "device/reset". > + > + This file allows to implement custom policy mechanism when VF is > + misbehaving and triggering adverse events above defined thresholds. > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/tileT/gtX/exec_quantum_ms > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/tileT/gtX/preempt_timeout_us > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX/exec_quantum_ms > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX/preempt_timeout_us > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + These files represent scheduling parameters of the functions. > + > + These scheduling parameters can be changed even if VFs are enabled > + and running, unless such change is not applicable on given platform > + due to fixed hardware or firmware assignment. > + > + exec_quantum_ms: (RW) integer 0..U32_MAX > + The GT execution quantum in [ms] assigned to the function. > + Requested quantum might be aligned per HW/FW requirements. > + > + Default is 0 (unlimited). 
> + > + preempt_timeout_us: (RW) integer 0..U32_MAX > + The GT preemption timeout in [us] assigned to the function. > + Requested timeout might be aligned per HW/FW requirements. > + > + Default is 0 (unlimited). > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/ggtt_quota > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/lmem_quota > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX/contexts_quota > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX/doorbells_quota > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + These files represent shared resources assigned to the functions. > + > + These resource parameters can be changed, unless VF is already running, > + or such change is not applicable on given platform due to fixed hardware > + or firmware assignment. > + > + Writes to these attributes may fail with: > + -EPERM if change is not applicable on given HW/FW. > + -E2BIG if value is larger than HW/FW limit. > + -EDQUOT if value is larger than maximum quota defined by the PF. > + -ENOSPC if PF can't allocate required quota. > + -EBUSY if the resource is currently in use by the VF. > + -EIO if GuC refuses to change provisioning. > + > + ggtt_quota: (RW) integer 0..U64_MAX > + The size of the GGTT address space (in bytes) assigned to the VF. > + The value might be aligned per HW/FW requirements. > + > + Default is 0 (unprovisioned). > + > + lmem_quota: (RW) integer 0..U64_MAX > + The size of the Local Memory (in bytes) assigned to the VF. > + The value might be aligned per HW/FW requirements. > + > + This attribute is only available on discrete platforms. > + > + Default is 0 (unprovisioned). > + > + contexts_quota: (RW) 0..U16_MAX > + The number of GuC submission contexts assigned to the VF. > + This value might be aligned per HW/FW requirements. > + > + Default is 0 (unprovisioned).
> + > + doorbells_quota: (RW) 0..U16_MAX > + The number of GuC doorbells assigned to the VF. > + This value might be aligned per HW/FW requirements. > + > + Default is 0 (unprovisioned). > + > + > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/tileT/gtX/thresholds/cat_error_count > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/tileT/gtX/thresholds/doorbell_time_us > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/tileT/gtX/thresholds/engine_reset_count > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/tileT/gtX/thresholds/h2g_time_us > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/tileT/gtX/thresholds/irq_time_us > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/pf/tileT/gtX/thresholds/page_fault_count > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX/thresholds/cat_error_count > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX/thresholds/doorbell_time_us > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX/thresholds/engine_reset_count > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX/thresholds/h2g_time_us > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX/thresholds/irq_time_us > +What: /sys/bus/pci/drivers/xe/.../sriov_extensions/vfN/tileT/gtX/thresholds/page_fault_count > +Date: 2024 > +KernelVersion: TBD > +Contact: intel-xe@xxxxxxxxxxxxxxxxxxxxx > +Description: > + These files represent threshold values used by the GuC to trigger > + security events if adverse event monitoring is enabled. > + > + These thresholds are checked every "monitoring_period_ms". > + Refer to GuC ABI for details about each threshold category. > + > + Default value for all thresholds is 0 (disabled). 
> + > + cat_error_count: (RW) integer > + doorbell_time_us: (RW) integer > + engine_reset_count: (RW) integer > + h2g_time_us: (RW) integer > + irq_time_us: (RW) integer > + page_fault_count: (RW) integer > diff --git a/Documentation/gpu/rfc/xe_sriov.rst b/Documentation/gpu/rfc/xe_sriov.rst > new file mode 100644 > index 000000000000..574f6414eabb > --- /dev/null > +++ b/Documentation/gpu/rfc/xe_sriov.rst > @@ -0,0 +1,192 @@ > +.. SPDX-License-Identifier: MIT > + > +======================== > +Xe – SR-IOV Support Plan > +======================== > + > +The Single Root I/O Virtualization (SR-IOV) extension to the PCI Express (PCIe) > +specification suite is supported starting from 12th generation of Intel Graphics > +processors. > + > +This document describes planned ABI of the new Xe driver (see xe.rst) that will > +provide flexible configuration and management options related to the SR-IOV. > +It will also highlight few most important changes to the Xe driver > +implementation to deal with Intel GPU SR-IOV specific requirements. > + > + > +SR-IOV Capability > +================= > + > +Due to SR-IOV complexity and required co-operation between hardware, firmware > +and kernel drivers, not all Xe architecture platforms might have SR-IOV enabled > +or fully functional. > + > +To control at the driver level which platform will provide support for SR-IOV, > +as we can't just rely on the PCI configuration data exposed by the hardware, > +we will introduce "has_sriov" flag to the struct xe_device_desc that describes > +a device capabilities that driver checks during the probe. > + > +Initially this flag will be set to disabled even on platforms that we plan to > +support. We will enable this flag only once we finish merging all required > +changes to the driver and related validated firmwares are also made available. 
> +
> +
> +SR-IOV Platforms
> +================
> +
> +Initially we plan to add SR-IOV functionality to the following SDV platforms
> +already supported by the Xe driver:
> +
> + - TGL (up to 7 VFs)
> + - ADL (up to 7 VFs)
> + - MTL (up to 7 VFs)
> + - ATSM (up to 31 VFs)
> + - PVC (up to 63 VFs)
> +
> +Newer platforms will be supported later, but we hope that enabling will be
> +much faster, as majority of the driver changes are either platform agnostic
> +or are similar between earlier platforms (hence we start with SDVs).
> +
> +
> +PF Mode
> +=======
> +
> +Support in the driver for acting in Physical Function (PF) mode, i.e. mode
> +that allows configuration of VFs, depends on the CONFIG_PCI_IOV and will be
> +enabled by default.
> +
> +However, due to potentially conflicting requirements for SR-IOV and other mega
> +features, we might want to have an option to disable SR-IOV PF mode support at
> +the driver load time.

What about making SR-IOV support in Xe dependent on a separate build option,
such as CONFIG_DRM_XE_SRIOV? This would allow users to enable SR-IOV with
CONFIG_PCI_IOV to virtualize other devices, let's say a network adapter, but
to keep this feature compiled out of Xe.

Francois

> +
> +Thus, we plan to use additional modparam named "sriov_totalvfs" which if set to
> +0 will force the driver to operate in the native (non-virtualized) mode.
> +The same modparam could be used to limit number of supported Virtual Functions
> +(VFs) by the driver compared to the hardware limit exposed in PCI configuration.
> +
> +The name of this modparam corresponds to the existing PCI sysfs attribute, that
> +by default exposes hardware capability.
> +
> +The default value of this param will allow to support all possible VFs as
> +claimed by the hardware.
> +
> +This modparam will have no effect if driver is running on the VF device.
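For reference, this is how I read the "sriov_totalvfs" modparam semantics described above, as a small sketch. The function name is made up, and using `None` for the default (no limit) is an assumption, since the RFC does not say how the default value is encoded.

```python
def effective_totalvfs(hw_totalvfs, modparam=None, is_vf=False):
    """Number of VFs the driver would expose through "sriov_totalvfs".

    modparam=None stands for the default (no limit). The real encoding of
    the default is not specified in the RFC.
    """
    if modparam is not None and modparam < 0:
        raise ValueError("modparam must not be negative")
    if is_vf or modparam is None:
        # The modparam has no effect on a VF device, and the default
        # allows everything the hardware claims.
        return hw_totalvfs
    # 0 forces native (non-virtualized) mode; larger values only cap
    # the hardware limit, they never raise it.
    return min(hw_totalvfs, modparam)
```

So on a part that claims 63 VFs, a modparam of 0 yields 0 (native mode), 4 yields 4, and 100 still yields 63.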
> + > + > +VFs Enabling > +============ > + > +To enable or disable VFs we plan to rely on existing sysfs attribute exposed by > +the PCI subsystem named "sriov_numvfs". We will provide all necessary tweaks to > +provision VFs in our custom implementation of the "sriov_configure" hook from > +the struct pci_driver. > + > +If for some reason, including explicit request to disable SR-IOV PF mode using > +modparam, we will not be able to correctly support any VFs, driver will change > +number of supported VFs, exposed to the userspace by "sriov_totalvfs" attribute, > +to 0, thus preventing configuration of the VFs. > + > + > +VF Mode > +======= > + > +When driver is running on the VF device, then due to hardware enforcements, > +access to the privileged registers is not possible. To avoid relying on these > +registers, we plan to perform early detection if we are running on the VF > +device using dedicated VF_CAP(0x1901f8) register and then use global macro > +IS_SRIOV_VF(xe) to control the driver logic. > + > +To speed up merging of the required changes, we might first introduce dummy > +macro that is always set to false, to prepare driver to avoid some code paths > +before we finalize our VF mode detection and other VFs enabling changes. > + > + > +Resources > +========= > + > +Most of the hardware (or firmware) resources available on the Xe architecture, > +like GGTT, LMEM, GuC context IDs, GuC doorbells, will be shared between PF and > +VFs and will require some provisioning steps to assign those resources for use > +by the VF. > + > +Until VFs are provisioned with resources, the PF driver will be able to use all > +resources, in the same way as it would be running in non-virtualized mode. > + > +If some resource (of part or region of it) is assigned to specific VF, then PF > +is not allowed to use that part or region of the resource, but can continue to > +use whatever is left available. 
> + > +Those resources are usually fully virtualized, so they will not require any > +special handling when used by the VF driver, except that VF driver must know > +the assigned quota. > + > +The most notable exception is the GGTT address space, as on some platforms, > +the VF driver must additionally know the real range that it can access. > + > +Once the resources were assigned to the VF use and the VF driver has started, > +then it is not allowed to change such provisioning, as that would break the > +VF driver. To make changes the VF driver, which was using these resources, > +must be unloaded (or the VM is terminated) and the VF device must be reset > +using the FLR. > + > + > +Scheduling > +========== > + > +The workloads from PF driver and VF drivers must be submitted to the hardware > +always by using the GuC submission mechanism. Unless VF has exclusive access > +to the GT then submissions from different VFs are time-sliced and controlled > +with additional "execution_quantum" and "preemption_timeout" parameters. > + > +In contrast to the resource provisioning, those scheduling parameters can be > +changed even if VF drivers are already running and are active. > + > + > +Automatic VFs Provisioning > +========================== > + > +To provide out-of-the box experience when user will be enabling VFs using > +generic "sriov_numvfs" attribute without requiring complex provisioning steps, > +the SR-IOV PF driver will implement automatic VFs resource provisioning. > + > +By default, all VFs will be allocated with the fair amount of the mandatory > +resources (like GGTT, GuC IDs) and with unrestricted scheduling parameters. > +Such provisioning should be sufficient for most of the normal usages, when > +no strict SLA is required. > + > +The PF driver will also expose some additional sysfs files to allow adjusting > +this automatic VFs provisioning, like default values for most of the > +provisioning parameters that PF will then apply for each enabled VF. 
> + > + Details about those extensions can be found in > + :download:`Preliminary Xe driver ABI <sysfs-driver-xe-sriov>`. > + > + > +Manual VFs Provisioning > +======================= > + > +If automatic VFs provisioning, which applies same configuration to every VF, > +is not sufficient or there is a need for advanced customization of some VF, > +the PF driver will also provide extended sysfs interface which will allow > +control of every provisioning attribute to the lowest feasible level. > + > +It is expected that these low-level attributes will be mostly used by the > +advanced users or by the custom tools that will set up configurations that > +meet predefined and validated SLA as required by the customers. > + > + Details about those extensions can be found in > + :download:`Preliminary Xe driver ABI <sysfs-driver-xe-sriov>`. > + > + > +VFs Monitoring > +============== > + > +In addition to the resource provisioning or changing scheduling parameters, > +the PF driver might also allow configuring some monitoring parameters, like > +thresholds of adverse events or sample period, to track undesired behavior > +of the VFs that could impact the whole system. > + > +Once those thresholds are set up and sampling period is defined, the GuC will > +notify the PF driver about which VF is exceeding the threshold and then PF is > +able to trigger the uevent to notify the administrator (or VMM) that could > +take some action against the VF. > -- > 2.25.1 >
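The monitoring description above boils down to comparing per-period event counts against per-VF thresholds, with 0 meaning "disabled". In the proposal this check is done by the GuC, so the following is only a model of the semantics. The function name and data layout are hypothetical, and treating "exceeding" as strictly greater than the threshold is my assumption.

```python
def vfs_over_threshold(counts, thresholds):
    """Return {vf: [event, ...]} for counters above their threshold.

    counts:     {vf: {event: count seen in the last monitoring period}}
    thresholds: {vf: {event: threshold}}, where 0 means "disabled"
    """
    report = {}
    for vf, events in counts.items():
        limits = thresholds.get(vf, {})
        over = [name for name, seen in sorted(events.items())
                if limits.get(name, 0) and seen > limits[name]]
        if over:
            report[vf] = over
    return report
```

A userspace policy tool could use such a report to decide whether to write to the proposed "stop" attribute of a misbehaving VF.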