On Tue, Oct 01, 2024 at 05:47:03PM +0200, Lukas Wunner wrote: > On Tue, Oct 01, 2024 at 11:13:17AM -0400, Gregory Price wrote: > > > > Gregory Price wrote: > > > > > Depending on the device, sometimes firmware clears the busy flag > > > > > later than expected. This can cause the device to appear busy when > > > > > calling multiple commands in quick sucession. Add a 1 second retry > > > > > window to all doe commands that end with -EBUSY. > > > > Just following up here, it sounds like everyone is unsure of this change. > > > > I can confirm that this handles the CDAT retry issue I am seeing, and that > > the BUSY bit is set upon entry into the initial call. Only 1 or 2 retries > > are attempted before it is cleared and returns successfully. > > > > I'd explored putting the retry logic in the CDAT code that calls into here, > > but that just seemed wrong. Is there a suggestion or a nak here? > > > > Trying to find a path forward. > > The PCIe Base Spec doesn't prescribe a maximum timeout for the > DOE BUSY bit to clear. Thus it seems fine to me in principle > to add a (or raise the) timeout if it turns out to be necessary > for real-life hardware. > > That said, the proposed patch has room for improvement: Will address and resubmit, thanks! > > * The patch seems to wait for DOE BUSY bit to clear *after* > completion. That's odd. The kernel waits for DOE Busy bit > to clear *before* sending a new request, in pci_doe_send_req(). > My expectation would have been that you'd add a loop there which > polls for DOE Busy bit to clear before sending a request. > > It seems that polling is the only option as no interrupt is > raised on DOE Busy bit clear, per PCIe r6.2 sec 6.30.3. > (Please add this bit of information to the commit message.) > > * The commit message should clearly specify the device(s) > affected by the issue (Vendor and Device ID plus name). > Comments such as "Depending on the device, sometimes ..." > are a little too vague. > > * The "1 or 2 retries" bit of information you're mentioning > above should likewise be in the commit message. > > * Please use "PCI/DOE:" as subject prefix to match previous > commits which touched drivers/pci/doe.c. > > * Please adhere to spec language, e.g. use "DOE Busy bit" > instead of "busy bit" so it's unambiguous for readers > what you're referring to. > > Thanks, > > Lukas