> -----Original Message----- > From: linux-usb-owner@xxxxxxxxxxxxxxx [mailto:linux-usb- > owner@xxxxxxxxxxxxxxx] On Behalf Of Sarah Sharp > Sent: Saturday, March 17, 2012 4:09 AM > To: Elric Fu > Cc: Xu, Andiry; linux-usb@xxxxxxxxxxxxxxx > Subject: Re: [PATCH 2/2] xHCI: fix bug of resume for VIA xHCI host > > On Thu, Mar 15, 2012 at 05:19:15PM +0800, Elric Fu wrote: > > 2012/3/15 Xu, Andiry <Andiry.Xu@xxxxxxx>: > > >> 2012/3/15 Xu, Andiry <Andiry.Xu@xxxxxxx>: > > >> >> The Save/Restore state funtion of VIA xHCI host is unstable. So add > > >> >> the XHCI_RESET_ON_RESUME quirk for it. > > >> >> > > >> > > > >> > Can you provide more background of the issue? What's the unstable > > >> > behavior of VIA host after resume? > > >> > > >> The issue of using the Save/Restore state operation will cause the > > >> device attached to the port of host is disconnected. > > >> > > > > Sorry. My description about the issue is incorrect. If using the > save/restore > > state function, after resume, the superspeed devices work fine but the > usb > > 2.0 devices will be disconnected. It seems like the embedded 2.0 hub in > > VIA host controller can't be re-enumerated, so the 2.0 devices will be > > disconnected. I don't know why. > > > > > > > > Hmm. I think xHCI suspend path may need to be save DCBAA and all > related > > > data structures before remove core well power. It's specified by xHCI > > > 1.0 errata. > > > > Well. Saving the DCBAA, Context and other data structures maybe fix this > > kind of issue. > > But the xHCI driver does save the DCBAA. It's in the xhci_hcd > structure. We restore the pointer to the dcbaa in > xhci_restore_registers(), which is called after a resume from suspend: > > static void xhci_restore_registers(struct xhci_hcd *xhci) > { > xhci_writel(xhci, xhci->s3.command, &xhci->op_regs->command); > xhci_writel(xhci, xhci->s3.dev_nt, &xhci->op_regs- > >dev_notification); > xhci_write_64(xhci, xhci->s3.dcbaa_ptr, &xhci->op_regs->dcbaa_ptr); > ... > > We don't restore the register state after a hibernate, and I've been > wondering why. Andiry, I think you (or one of the AMD team) wrote the > suspend/resume code. Do you remember why we unconditionally do not > restore the registers on resume from hibernate? > In my original design, xhci_resume() does not perform complete cleanup/initialization in hibernation. It just reset the xHC, in that case we need to restore the registers. I got information that Windows USB3.0 driver is also doing this way. However, you pointed out that we need a complete re-initialization, so I removed the registers restore part, because all the related registers will be re-written during xhci_init(). > int xhci_resume(struct xhci_hcd *xhci, bool hibernated) > { > ... > if (!hibernated) { > /* restore register code */ > } > /* If restore operation fails, re-initialize the HC during resume > */ > if ((temp & STS_SRE) || hibernated) { > } > > I know that power is completely removed during hibernate, so maybe we > just can't restore the registers? > > Now, I did look at the xHCI 1.0 spec errata from last June, and I think > we're saving and restoring the register state in the wrong order. > Patches to follow shortly. > > > >> >> When driver free Event Ring in the resume procedure, driver can't > > >> >> write zero to Event Ring Registers in VIA xHCI host, otherwise the > > >> >> VIA host can't be resumed. > > >> >> > > >> > > > >> > Strange. What do you mean by "can't resume"? Does xhci_resume() > return > > >> > failure, or it returns success but host does not work after resume? > > >> > > >> The issue is the host will hang up. The reason is that the host has a > > >> hardware state machine, in the reset-on-resume procedure driver will > > >> reset the host first then free event ring and re-initialize the host. > But > > >> if driver set 0 to event ring registers in xhci_mem_cleanup() after > reset > > >> the host, the host will think the "0" is a vaild value, and fetch it. > it > > >> result > > >> in the host hang up. > > >> > > > > > > xhci_halt() is called before xhci_reset(). Why does the host fetch > event ring > > > registers after it's halted? > > > > I don't know. As far as I know, after reset the VIA host, the host > > will fetch the > > value of event ring register as long as the driver write those registers. > Then > > after driver write 0 to those registers the host thinks the value is > > valid, maybe > > do something to event ring after fetching the value. I don't know the > more > > details about it. > > > > >> > If the host needs a reset-on-resume, it will perform a complete > > >> > re-initialization. After xhci_mem_cleanup() called, xhci_resume() > will > > >> > call xhci_init(), in which the ERST registers are re-written. Have > you > > >> > checked the value, see if it's different form before suspend? > > >> > > >> I don't think in reset-on-resume write 0 to event Ring registers is > > >> necessary. > > >> As driver already reset the host before xhci_mem_cleanup(), the value > of > > >> event ring registers will be 0. > > As Eric pointed out, the solution seems to just be to not write zeroes > to the DCBAA on memory clean up. After halt and reset, the xHCI host > DCBAA entry will be reset to its default (zero), so there's no reason to > write it again. If the host *isn't* halted because the stop command > write timed out, we really don't want to write anything to the DCBAA. > > > >> > Also, note xhci_mem_cleanup is called not only by xhci_resume(), but > also > > >> > by xhci_stop(). This means you may find the same issue by > rmmod/modprobe > > >> > xhci-hcd. > > >> > > >> rmmod/modprobe xhci-hcd is ok. I have already tested it. I think it is > > >> ok if we write > > >> a valid value except for 0 to event ring registers in VIA xHCI host > > >> after reset the > > >> host. > > That's very strange. The shutdown code in the failed restore case is > nearly identical to the code in xhci_stop() that's called in the > shutdown case. The only difference is that xhci_halt() is called twice > in the shutdown case, so maybe the VIA host is not actually halted on > the failed register restore path? Maybe we need to increase > XHCI_MAX_HALT_USEC? > > Actually, the shutdown and hibernate/failed register restore paths > functions desperately need to be refactored to share code. It's > basically (but not quite) copy-paste between the two functions. For > instance, there's a call to usb_amd_dev_put() in xhci_stop(), but not in > the failed register restore/hibernate case. > > Andiry, is that the PLL shutdown behavior you wanted? > Yes. Because the match of usb_amd_dev_put() is in xhci_pci_setup(), not in xhci_init(). Thanks, Andiry > Eric, I'm going to send you a couple of patches, and I want you to apply > them one at a time and try suspending and resuming (hopefully with > netconsole or something so you can capture log messages on resume). > > The first patch adds a warning to xhci_halt in case the host doesn't > actually halt. We should see if your failed resume is due to the host > not actually being halted when DCBAA is written. > > The second removes the writes of zero to the DCBAA and command ring > pointers in xhci_mem_cleanup(), which should remove the hang all > together. > > The third increases the xHCI halt timeout, and should only be applied if > you find the host isn't halting in time. > > The fourth changes the order of the register saves and restores to match > the xHCI 1.0 errata. That may make your host indicate the register > restore actually worked, which will mean the xHCI driver doesn't need to > re-initialize the host at all. > > Sarah Sharp > -- > To unsubscribe from this list: send the line "unsubscribe linux-usb" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html