On Mon, 15 Jun 2020 at 10:09, Krzysztof Kozlowski <krzk@xxxxxxxxxx> wrote: > > On Sun, Jun 14, 2020 at 06:34:33PM +0300, Vladimir Oltean wrote: > > On Sun, 14 Jun 2020 at 18:12, Krzysztof Kozlowski <krzk@xxxxxxxxxx> wrote: > > > > > > On Sun, Jun 14, 2020 at 04:43:28PM +0300, Vladimir Oltean wrote: > > > > On Sun, 14 Jun 2020 at 16:39, Vladimir Oltean <olteanv@xxxxxxxxx> wrote: > > > > > > > > > > On Sun, 14 Jun 2020 at 14:18, Krzysztof Kozlowski <krzk@xxxxxxxxxx> wrote: > > > > > > > > > > > > On Sun, Jun 14, 2020 at 02:14:15PM +0300, Vladimir Oltean wrote: > > > > > > > On Sun, 14 Jun 2020 at 13:56, Krzysztof Kozlowski <krzk@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > If interrupt fires early, the dspi_interrupt() could complete > > > > > > > > (dspi->xfer_done) before its initialization happens. > > > > > > > > > > > > > > > > Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion") > > > > > > > > Also please note that this patch merely replaced an > > > > init_waitqueue_head with init_completion. But the "bug" (if we can > > > > call it that) originates from even before. > > > > > > Yeah, I know, the Fixes is not accurate. Backport to earlier kernels > > > would be manual so I am not sure if accurate Fixes matter. > > > > > > > > > > > > > > > Cc: <stable@xxxxxxxxxxxxxxx> > > > > > > > > Signed-off-by: Krzysztof Kozlowski <krzk@xxxxxxxxxx> > > > > > > > > --- > > > > > > > > > > > > > > Why would an interrupt fire before spi_register_controller, therefore > > > > > > > before dspi_transfer_one_message could get called? > > > > > > > Is this master or slave mode? > > > > > > > > > > > > I guess practically it won't fire. It's more of a matter of logical > > > > > > order and: > > > > > > 1. Someone might fix the CONFIG_DEBUG_SHIRQ_FIXME one day, > > > > > > > > > > And what if CONFIG_DEBUG_SHIRQ_FIXME gets fixed? I uncommented it, and > > > > > still no issues. dspi_interrupt checks the status bit of the hw, sees > > > > > there's nothing to do, and returns IRQ_NONE. > > > > > > Indeed, still the logical way of initializing is to do it before any > > > possible use. > > > > > > > > > > > > > > 2. The hardware is actually initialized before and someone could attach > > > > > > to SPI bus some weird device. > > > > > > > > > > > > > > > > Some weird device that does what? > > > > > > You never know what people will connect to a SoM :). > > > > > > Wolfram made actually much better point - bootloaders are known to > > > initialize some things and leaving them in whatever state, assuming that > > > Linux kernel will redo any initialization properly. > > > > > > Best regards, > > > Krzysztof > > > > > > > I don't buy the argument. > > So ok, maybe some broken bootloader leaves a SPI_SR interrupt pending > > (do you have any example of that?). But the driver clears interrupts > > by writing SPI_SR_CLEAR in dspi_init (called _before_ requesting the > > IRQ). It clears 10 bits from the status register. There are 2 points > > to be made here: > > - The dspi_interrupt only handles data availability interrupt > > (SPI_SR_EOQF | SPI_SR_CMDTCF). Only then does it matter whether the > > completion was already initialized or not. But these interrupts _are_ > > cleared. But assume they weren't. What would Linux even do with a SPI > > transfer initiated by the previously running software environment? Why > > would it be a smart thing to handle that data in the first place? > > - The 10 bits from the status register are all the bits that can be > > cleared. The rest of the register, if you look at it, contains the TX > > FIFO Counter, the Transmit Next Pointer, the RX FIFO Counter, and the > > Pop Next Pointer. > > So, unless there's something I'm missing, I don't actually see how > > this broken bootloader can do any harm to us. > > Let's rephrase it: you think therefore that completion should be > initialzed *after* requesting shared interrupts? You think that exactly > that order shall be used in the source code? > > Best regards, > Krzysztof > > I think that completion should be initialized before it is used, just like any other variable. So far you have not proven any code path through which it can be used uninitialized, therefore I don't see why this should be accepted as a bug fix. Cleanup, cosmetic refactoring, design patterns, whatever, sure. Thanks, -Vladimir