Em Thu, 6 Dec 2018 17:07:52 -0200 Mauro Carvalho Chehab <mchehab@xxxxxxxxxx> escreveu: > Em Thu, 6 Dec 2018 13:36:24 -0500 > Alex Deucher <alexdeucher@xxxxxxxxx> escreveu: > > > On Thu, Dec 6, 2018 at 1:05 PM Mauro Carvalho Chehab <mchehab@xxxxxxxxxx> wrote: > > > > > > Em Thu, 06 Dec 2018 18:18:23 +0100 > > > Markus Dobel <markus.dobel@xxxxxx> escreveu: > > > > > > > Hi everyone, > > > > > > > > I will try if the hack mentioned fixes the issue for me on the weekend (but I assume, as if effectively removes the function). > > > > > > It should, but it keeps a few changes. Just want to be sure that what > > > would be left won't cause issues. If this works, the logic that would > > > solve Ryzen DMA fixes will be contained into a single point, making > > > easier to maintain it. > > > > > > > > > > > Just in case this is of interest, I neither have Ryzen nor Intel, but an HP Microserver G7 with an AMD Turion II Neo N54L, so the machine is more on the slow side. > > > > > > Good to know. It would probably worth to check if this Ryzen > > > bug occors with all versions of it or with just a subset. > > > I mean: maybe it is only at the first gen or Ryzen and doesn't > > > affect Ryzen 2 (or vice versa). > > > > The original commit also mentions some Xeons are affected too. Seems > > like this is potentially an issue on the device side rather than the > > platform. > > Maybe. > > > > > > > The PCI quirks logic will likely need to detect the PCI ID of > > > the memory controllers found at the buggy CPUs, in order to enable > > > the quirk only for the affected ones. > > > > > > It could be worth talking with AMD people in order to be sure about > > > the differences at the DMA engine side. > > > > > > > It's not clear to me what the pci or platform quirk would do. The > > workaround seems to be in the driver, not on the platform. > > Yeah, the fix should be at the driver, but pci/quirk.c would be able > to detect memory controllers that would require a hack inside the > driver, in a similar way to what the media PCI drivers already do > for DMA controllers that don't support pci2pci transfers. > > There, basically the PCI core (drivers/pci/pci.c and > drivers/pci/quirks.c) sets a flag (pci_pci_problems) indicating > potential issues. > > Then, the driver compares such flag in order to enable the specific quirk. > > Ok, there would be a different way to handle it. The driver could use a > logic similar to the one I wrote for drivers/edac/i7core_edac.c. There, > the logic seeks for some specific PCI device IDs using pci_get_device(), > adjusting the code accordingly, depending on the detected PCI devices. > > In other words, running something like this (untested), at probe time should > produce similar results: > > /* > * FIXME: It probably makes sense to also be able to identify specific > * versions of the same PCI ID, just in case a latter stepping got a > * fix for the issue. > */ > const static struct { > int vendor, dev; > } broken_dev_id[] = { > PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_foo, > PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_bar, > }, > > bool cx23885_does_dma_require_reset(void) > { > int i; > struct pci_dev *pdev = NULL; > > for (i = 0; i < sizeof(broken_dev_id); i++) { > pdev = pci_get_device(broken_dev_id[i].vendor, broken_dev_id[i].dev, NULL); > if (pdev) { > pci_put_device(pdev); > return true; > } > } > return false; > } > > Should work. In any case, we need to know what memory controllers > have problems, and what are their PCI IDs, and add them (if not there yet) > at include/linux/pci_ids.h > > Thanks, > Mauro To be clearer, I'm thinking on something like the (untested) code below (untested). PS.: the PCI ID used there may be wrong. I just added one in order to have a proof of concept. Thanks, Mauro [PATCH] media: cx23885: only reset DMA on problematic CPUs It is reported that changeset 95f408bbc4e4 ("media: cx23885: Ryzen DMA related RiSC engine stall fixes") caused regresssions with other CPUs. Ensure that the quirk will be applied only for the CPUs that are known to cause problems. Fixes: 95f408bbc4e4 ("media: cx23885: Ryzen DMA related RiSC engine stall fixes") Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@xxxxxxxxxx> diff --git a/drivers/media/pci/cx23885/cx23885-core.c b/drivers/media/pci/cx23885/cx23885-core.c index 39804d830305..48da7d194cc1 100644 --- a/drivers/media/pci/cx23885/cx23885-core.c +++ b/drivers/media/pci/cx23885/cx23885-core.c @@ -23,6 +23,7 @@ #include <linux/moduleparam.h> #include <linux/kmod.h> #include <linux/kernel.h> +#include <linux/pci.h> #include <linux/slab.h> #include <linux/interrupt.h> #include <linux/delay.h> @@ -603,8 +604,13 @@ static void cx23885_risc_disasm(struct cx23885_tsport *port, static void cx23885_clear_bridge_error(struct cx23885_dev *dev) { - uint32_t reg1_val = cx_read(TC_REQ); /* read-only */ - uint32_t reg2_val = cx_read(TC_REQ_SET); + uint32_t reg1_val, reg2_val; + + if (!dev->need_dma_reset) + return; + + reg1_val = cx_read(TC_REQ); /* read-only */ + reg2_val = cx_read(TC_REQ_SET); if (reg1_val && reg2_val) { cx_write(TC_REQ, reg1_val); @@ -2058,6 +2064,31 @@ void cx23885_gpio_enable(struct cx23885_dev *dev, u32 mask, int asoutput) /* TODO: 23-19 */ } +static struct { + int vendor, dev; +} const broken_dev_id[] = { + /* According with + * https://openbenchmarking.org/system/1703021-RI-AMDZEN08075/Ryzen%207%201800X/lspci, + * 0x1451 is PCI ID for the IOMMU found on Ryzen 7 + */ + { PCI_VENDOR_ID_AMD, 0x1451 }, +}; + +static bool cx23885_does_need_dma_reset(void) +{ + int i; + struct pci_dev *pdev = NULL; + + for (i = 0; i < sizeof(broken_dev_id); i++) { + pdev = pci_get_device(broken_dev_id[i].vendor, broken_dev_id[i].dev, NULL); + if (pdev) { + pci_dev_put(pdev); + return true; + } + } + return false; +} + static int cx23885_initdev(struct pci_dev *pci_dev, const struct pci_device_id *pci_id) { @@ -2069,6 +2100,8 @@ static int cx23885_initdev(struct pci_dev *pci_dev, if (NULL == dev) return -ENOMEM; + dev->need_dma_reset = cx23885_does_need_dma_reset(); + err = v4l2_device_register(&pci_dev->dev, &dev->v4l2_dev); if (err < 0) goto fail_free; diff --git a/drivers/media/pci/cx23885/cx23885.h b/drivers/media/pci/cx23885/cx23885.h index d54c7ee1ab21..cf965efabe66 100644 --- a/drivers/media/pci/cx23885/cx23885.h +++ b/drivers/media/pci/cx23885/cx23885.h @@ -451,6 +451,8 @@ struct cx23885_dev { /* Analog raw audio */ struct cx23885_audio_dev *audio_dev; + /* Does the system require periodic DMA resets? */ + unsigned int need_dma_reset:1; }; static inline struct cx23885_dev *to_cx23885(struct v4l2_device *v4l2_dev)