On Mon, Nov 29, 2021 at 11:32 AM Ben Widawsky <ben.widawsky@xxxxxxxxx> wrote:
[..]
> >
> > Right, there's no harm in the check, it just seems overly paranoid to
> > me if it was already checked once. Until a doorbell timeout happens
> > it's an extra MMIO cycle that can be saved for a "what happened?" check
> > after a timeout.
>
> Well I suspect we're just rearranging the deck chairs on the Titanic now, but...

Not so much, just trying to get this driver in line with other error
handling designs.

> I see doorbell timeouts as disconnected from whether or not the mailbox
> interface is ready. If they were the same, we wouldn't need both bits and we
> could just wait extra long for the doorbell when probing.
>
> In other words, I expect if the interface goes unready, doorbell timeout will
> occur, but I don't think we should assume if doorbell timeout occurs, the
> interface is no longer ready. I don't purport to know why a doorbell timeout
> might occur while the interface remains available (likely a firmware bug, I
> presume).
>
> It does seem interesting to check if the interface is no longer ready on timeout
> though.

So I'm just modeling this off of NVMe error handling, where there is a
Controller Fatal Status bit that could be checked on every transaction,
but instead the driver waits until a command timeout to check whether
the device went fatal / not-ready.
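
To make the pattern concrete, here is a rough userspace sketch of "only
read the ready bit after a doorbell timeout". It is not the driver code:
the fake_* names, the register layout, the poll limit, and the choice of
-ENXIO vs. -ETIMEDOUT are all assumptions made up for illustration, and
the "MMIO reads" are plain memory accesses on a simulated device.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical register model; none of these names come from the driver. */
#define FAKE_DOORBELL_BUSY	(1u << 0)
#define FAKE_IFACE_READY	(1u << 0)
#define FAKE_POLL_LIMIT		4

struct fake_mbox {
	uint32_t ctrl;			/* holds the doorbell bit */
	uint32_t status;		/* holds the interface-ready bit */
	unsigned int busy_polls;	/* simulation: reads that still see busy */
	unsigned int status_reads;	/* counts simulated status register reads */
};

/* Simulated control register read: doorbell clears after busy_polls reads. */
static uint32_t fake_read_ctrl(struct fake_mbox *m)
{
	if (m->busy_polls > 0)
		m->busy_polls--;
	else
		m->ctrl &= ~FAKE_DOORBELL_BUSY;
	return m->ctrl;
}

/* Simulated status register read; counted so the saved cycle is visible. */
static uint32_t fake_read_status(struct fake_mbox *m)
{
	m->status_reads++;
	return m->status;
}

static int fake_mbox_send(struct fake_mbox *m)
{
	int i;

	m->ctrl |= FAKE_DOORBELL_BUSY;	/* ring the doorbell */

	for (i = 0; i < FAKE_POLL_LIMIT; i++) {
		if (!(fake_read_ctrl(m) & FAKE_DOORBELL_BUSY))
			return 0;	/* fast path: status is never read */
	}

	/* Timed out: only now spend a read on the "what happened?" check. */
	if (!(fake_read_status(m) & FAKE_IFACE_READY))
		return -ENXIO;		/* interface went not-ready */
	return -ETIMEDOUT;		/* still ready, e.g. a firmware bug */
}
```

With this shape the successful path costs zero status reads, and the two
timeout cases stay distinguishable, which covers the point that a
doorbell timeout does not by itself imply the interface is not ready.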