On Wed, 2014-10-29 at 18:57 +0100, Andreas Hartmann wrote: > Alex Williamson schrieb: > > On Wed, 2014-10-29 at 17:47 +0100, Andreas Hartmann wrote: > >> Alex Williamson wrote: > >>> On Sat, 2014-10-25 at 08:03 +0200, Andreas Hartmann wrote: > >>>> > >>>> Out of interest: > >>>> Bjorn's patch disables vc save/restore support - and the machine works > >>>> fine again. Why is it needed at all if it seems to work perfectly w/o > >>>> it? What's the additional benefit? Or in other words: What am I missing > >>>> until today :-) ? What would be better? What could I do more? > >>> > >>> > >>> You're right, in the configuration you have the endpoint device has a > >>> Virtual Channel capability but the upstream root port does not. The > >>> spec is not at all clear about defining the endpoints for enabling > >>> Virtual Channel in each type of configuration, but I think that if we > >>> have an upstream port that does not support Virtual Channel, we can skip > >>> the save/restore. Please test the patch below. > >>> > >>> I'm also still completely confused about whether this is a VC > >>> save/restore issue or a bus reset issue. You originally bisected this > >>> back to the VC save/restore patch, but you also found that a manual, > >>> setpci-based bus reset triggered a system hang. > >> > >> With your additional patch posted here: > >> http://article.gmane.org/gmane.linux.kernel.pci/36162 > > > > Right, a reset via sysfs also triggered it with that patch, but the > > reset via setpci is independent of any VC save/restore and still hung > > your box. > > > >> > >>> I believe that > >>> re-ordering the kernel reset mechanisms also triggered this. Since > >>> recent versions of QEMU are going to favor a bus reset over PM reset, I > >>> don't have a lot of confidence that we're actually solving the problem > >>> for you. Please make sure to test with a recent QEMU to be sure we'll > >>> do a bus reset. > >> > >> I'm running qemu 2.1.0 (newest is 2.1.2 - but this shouldn't be a > >> problem) and tested w/ linux 3.17. > > > > Yep, just want to make sure it's QEMU new enough to do a bus reset and > > kernel with matching support. > > > >>> diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c > >>> index 7e1304d..6d13d34 100644 > >>> --- a/drivers/pci/vc.c > >>> +++ b/drivers/pci/vc.c > >>> @@ -339,6 +339,25 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos, > >>> return buf ? 0 : len; > >>> } > >>> > >>> +/** > >>> + * pci_vc_needs_save - Determine whether a VC capability needs to be saved > >>> + * @dev: device > >>> + * @id: VC capability ID (VC/VC9/MFVC) > >>> + * > >>> + * In configurations where we have a VC or MFVC capability, but the upstream > >>> + * device does not, we assume that VC save (and therefore restore) is not > >>> + * necessary. The intention is to only do VC save/restore in configuration > >>> + * where it's necessary and hopefully avoid reset issues. > >>> + */ > >>> +static bool pci_vc_needs_save(struct pci_dev *dev, u16 id) > >>> +{ > >>> + if (id == PCI_EXT_CAP_ID_VC9 || pci_is_root_bus(dev->bus) || > >>> + pci_find_ext_capability(dev->bus->self, PCI_EXT_CAP_ID_VC)) > >>> + return true; > >>> + > >>> + return false; > >>> +} > >>> + > >>> static struct { > >>> u16 id; > >>> const char *name; > >>> @@ -362,7 +381,7 @@ int pci_save_vc_state(struct pci_dev *dev) > >>> struct pci_cap_saved_state *save_state; > >>> > >>> pos = pci_find_ext_capability(dev, vc_caps[i].id); > >>> - if (!pos) > >>> + if (!posi || !pci_vc_needs_save(dev, vc_caps[i].id)) > >> ^ > >> This should be most probably !pos (and not !posi - because !posi does > >> through a compile error). > > > > Oops, sorry. > > > >>> continue; > >>> > >>> save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); > >>> @@ -422,7 +441,7 @@ void pci_allocate_vc_save_buffers(struct pci_dev *dev) > >>> for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { > >>> int len, pos = pci_find_ext_capability(dev, vc_caps[i].id); > >>> > >>> - if (!pos) > >>> + if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id)) > >>> continue; > >>> > >>> len = pci_vc_do_save_buffer(dev, pos, NULL, false); > >> > >> W/ the above patch, the machine hangs again (w/ qemu and setpci), but w/ > >> Bjorn's patch (and nothing more applied) which disables vc save/restore, > >> the machine just works fine ... . I especially retested this case to be > >> really sure. I'm so sorry. But that's how it behaves here :-( > > > > Hmm, the intention was that this should effectively do the same thing as > > Bjorn's patch. The Atheros device (03:00.0) reports a VC capability but > > the root port above it (00:05.0) does not. > > Are you sure, that this patch really works (-> here!) as expected? Would > it be possible to add some debug output printing to the actual console > (not to log file) to be sure it really works as expected? Maybe some > more output to get an idea what's actually going on? Or is it just a > timing issue? Sure, here's some added printks (and fixed posi). You should be able to run 'dmesg | grep pci_vc_needs_save' after boot and see device 0000:03:00.0. Hopefully you won't see the pci_save_vc_state() printk as you assign the device. diff --git a/drivers/pci/vc.c b/drivers/pci/vc.c index 7e1304d..300e126 100644 --- a/drivers/pci/vc.c +++ b/drivers/pci/vc.c @@ -339,6 +339,26 @@ static int pci_vc_do_save_buffer(struct pci_dev *dev, int pos, return buf ? 0 : len; } +/** + * pci_vc_needs_save - Determine whether a VC capability needs to be saved + * @dev: device + * @id: VC capability ID (VC/VC9/MFVC) + * + * In configurations where we have a VC or MFVC capability, but the upstream + * device does not, we assume that VC save (and therefore restore) is not + * necessary. The intention is to only do VC save/restore in configuration + * where it's necessary and hopefully avoid reset issues. + */ +static bool pci_vc_needs_save(struct pci_dev *dev, u16 id) +{ + if (id == PCI_EXT_CAP_ID_VC9 || pci_is_root_bus(dev->bus) || + pci_find_ext_capability(dev->bus->self, PCI_EXT_CAP_ID_VC)) + return true; + + printk("%s(%s, %x) returning false\n", __func__, pci_name(dev), id); + return false; +} + static struct { u16 id; const char *name; @@ -362,7 +382,7 @@ int pci_save_vc_state(struct pci_dev *dev) struct pci_cap_saved_state *save_state; pos = pci_find_ext_capability(dev, vc_caps[i].id); - if (!pos) + if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id)) continue; save_state = pci_find_saved_ext_cap(dev, vc_caps[i].id); @@ -372,6 +392,7 @@ int pci_save_vc_state(struct pci_dev *dev) return -ENOMEM; } + I printk("%s doing %s save on %s\n", __func__, vc_caps[i].name, pci_name(dev)); ret = pci_vc_do_save_buffer(dev, pos, save_state, true); if (ret) { dev_err(&dev->dev, "%s save unsuccessful %s\n", @@ -403,6 +424,7 @@ void pci_restore_vc_state(struct pci_dev *dev) if (!save_state || !pos) continue; + I printk("%s doing %s restore on %s\n", __func__, vc_caps[i].name, pci_name(dev)); pci_vc_do_save_buffer(dev, pos, save_state, false); } } @@ -422,7 +444,7 @@ void pci_allocate_vc_save_buffers(struct pci_dev *dev) for (i = 0; i < ARRAY_SIZE(vc_caps); i++) { int len, pos = pci_find_ext_capability(dev, vc_caps[i].id); - if (!pos) + if (!pos || !pci_vc_needs_save(dev, vc_caps[i].id)) continue; len = pci_vc_do_save_buffer(dev, pos, NULL, false); -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html