On Sat, Aug 07, 2021 at 03:01:56PM +0800, Dongdong Liu wrote: > > On 2021/8/5 23:31, Bjorn Helgaas wrote: > > On Thu, Aug 05, 2021 at 04:37:39PM +0800, Dongdong Liu wrote: > > > > > > > > > On 2021/8/5 7:49, Bjorn Helgaas wrote: > > > > On Wed, Aug 04, 2021 at 09:47:06PM +0800, Dongdong Liu wrote: > > > > > PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports > > > > > sending Requests to other Endpoints (as opposed to host memory), the > > > > > Endpoint must not send 10-Bit Tag Requests to another given Endpoint > > > > > unless an implementation-specific mechanism determines that the Endpoint > > > > > supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file, > > > > > write 0 to disable 10-Bit Tag Requester when the driver does not bind > > > > > the device if the peer device does not support the 10-Bit Tag Completer. > > > > > This will make P2P traffic safe. the 10bit_tag file content indicate > > > > > current 10-Bit Tag Requester Enable status. > > > > > > > > > > Signed-off-by: Dongdong Liu <liudongdong3@xxxxxxxxxx> > > > > > --- > > > > > Documentation/ABI/testing/sysfs-bus-pci | 16 +++++++- > > > > > drivers/pci/pci-sysfs.c | 69 +++++++++++++++++++++++++++++++++ > > > > > 2 files changed, 84 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci > > > > > index 793cbb7..0e0c97d 100644 > > > > > --- a/Documentation/ABI/testing/sysfs-bus-pci > > > > > +++ b/Documentation/ABI/testing/sysfs-bus-pci > > > > > @@ -139,7 +139,7 @@ Description: > > > > > binary file containing the Vital Product Data for the > > > > > device. It should follow the VPD format defined in > > > > > PCI Specification 2.1 or 2.2, but users should consider > > > > > - that some devices may have incorrectly formatted data. > > > > > + that some devices may have incorrectly formatted data. > > > > > If the underlying VPD has a writable section then the > > > > > corresponding section of this file will be writable. > > > > > > > > > > @@ -407,3 +407,17 @@ Description: > > > > > > > > > > The file is writable if the PF is bound to a driver that > > > > > implements ->sriov_set_msix_vec_count(). > > > > > + > > > > > +What: /sys/bus/pci/devices/.../10bit_tag > > > > > +Date: August 2021 > > > > > +Contact: Dongdong Liu <liudongdong3@xxxxxxxxxx> > > > > > +Description: > > > > > + If a PCI device support 10-Bit Tag Requester, will create the > > > > > + 10bit_tag sysfs file. The file is readable, the value > > > > > + indicate current 10-Bit Tag Requester Enable. > > > > > + 1 - enabled, 0 - disabled. > > > > > + > > > > > + The file is also writeable, the value only accept by write 0 > > > > > + to disable 10-Bit Tag Requester when the driver does not bind > > > > > + the deivce. The typical use case is for p2pdma when the peer > > > > > + device does not support 10-BIT Tag Completer. > > > > > > > +static ssize_t pci_10bit_tag_store(struct device *dev, > > > > > + struct device_attribute *attr, > > > > > + const char *buf, size_t count) > > > > > +{ > > > > > + struct pci_dev *pdev = to_pci_dev(dev); > > > > > + bool enable; > > > > > + > > > > > + if (kstrtobool(buf, &enable) < 0) > > > > > + return -EINVAL; > > > > > + > > > > > + if (enable != false ) > > > > > + return -EINVAL; > > > > > > > > Is this the same as "if (enable)"? > > > Yes, Will fix. > > > > I actually don't like the one-way nature of this. When the hierarchy > > supports 10-bit tags, we automatically enable them during enumeration. > > > > Then we provide this sysfs file, but it can only *disable* 10-bit > > tags. There's no way to re-enable them except by rebooting (or using > > setpci, I guess). > > > > Why can't we allow *enabling* them here if they're supported in this > > hierarchy? > Yes, we can also provide this sysfs to enable 10-bit tag for EP devices > when the hierarchy supports 10-bit tags. > > I do not want to provide sysfs to enable/disable 10-bit tag for RP > devices as I can not tell current if the the Function has outstanding > Non-Posted Requests, may need to unbind all the EP drivers under the > RP, and current seems no scenario need to do this. This will make things > more complex. You mean "no scenario requires disabling 10-bit tags and then re-enabling them"? That may be true, but I'm still hesitant to provide a switch than can only be reversed by rebooting. This is similar to the issue Leon raised that it's not practical to reboot machines. Maybe we accept a one-way switch if the sole purpose is to work around a hardware defect. Or maybe a kernel parameter that disables 10-bit tags completely is the defect mitigation. I think we probably need such a parameter in case a defect prevents us from booting far enough to use the sysfs switch.