On Thu, Jul 16, 2020 at 05:06:19PM +0800, Yicong Yang wrote: > On 2020/7/11 7:09, Bjorn Helgaas wrote: > > On Sat, Jun 13, 2020 at 05:32:13PM +0800, Yicong Yang wrote: > >> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex > >> integrated Endpoint(RCiEP) device, providing the capability > >> to dynamically monitor and tune the PCIe traffic parameters(tune), > >> and trace the TLP headers to the memory(trace). > >> > >> Add the driver for the device to enable its functions. The driver > >> will create debugfs directory for each PTT device, and users can > >> operate the device through the files under its directory. > >> +Tune > >> +==== > >> + > >> +PTT tune is designed for monitoring and adjusting PCIe link parameters(events). > >> +Currently we support events 4 classes. The scope of the events > >> +covers the PCIe core with which the PTT device belongs to. > > All of these look like things that have the potential to break the > > PCIe protocol and cause errors like timeouts, receiver overflows, etc. > > That's OK for a debug/analysis situation, but it should taint the > > kernel somehow because I don't want to debug problems like that if > > they're caused by users tweaking things. > > > > That might even be a reason *not* to merge the tune side of this. I > > can see how it might be useful for you internally, but it's not > > obvious to me how it will benefit other users. Maybe that part should > > be an out-of-tree module? > > All the tuning values are not accurate, but abstracted to several > _levels_ of each events. The levels are delicately designed to > guarantee by the hardware that they are always valid and will not > break the PCIe link. The possible level values exposed to the users > is tested and safe and other values will not be accepted. > > The final tuning events is not settled and we'll not exposed the > events which will may lead to the link broken. Furthermore, maybe we > could default disable the tune events' level adjustment and make > them readonly. The user can enable the full tune function by a > module parameters or in the BIOS, and a warning message will be > displayed. > > The tune part is beneficial for the users and not only for our > internal use. We intends to provide a way to tune the link > depending on the downstream components and link configuration. For > example, users can tune the data path QoS level to get better > performance according to the link width is x8 or x16, or according > to the endpoints' class is a network card or a nvme disk. It will > make our controller adapt to different condition with high > performance, so we hope this feature to be merged. OK. This driver itself is outside my area, so I guess merging it is up to Alexander. Do you have any measurements of performance improvements? I think it would be good to have real numbers showing that this is useful. You mentioned a warning message, so I assume you'll add some kind of dmesg logging when tuning happens? Is this protected so it's only usable by root or other appropriate privileged user? > >> + * The PTT can designate function for trace. > >> + * Add the root port's subordinates in the list as we > >> + * can specify certain function. > >> + */ > >> + child_bus = tpdev->subordinate; > >> + list_for_each_entry(tpdev, &child_bus->devices, bus_list) { > > *This* looks like a potential problem with hotplug. How do you deal > > with devices being added/removed after this loop? > > Yes. I have considered the add/remove situation but not intend to address it > in this RFC and assume the topology is static after probing. > I will manage the situation in next version. What happens if a device is added or removed after boot? If the only limitation is that you can't tune or trace a hot-added device, that's fine. (I mean, it's really *not* fine because it's a poor user experience, but at least it's just a usability issue, not a crash.) But if hot-adding or hot-removing a device can cause an oops or a crash or something, *that* is definitely a problem. Bjorn