On Thu, Oct 10, 2019 at 10:51:45AM +0200, Alexander Gordeev wrote:
> On Wed, Oct 09, 2019 at 09:53:23PM +0300, Dan Carpenter wrote:
> > > > > +	u32 *rd_flags = hw->dma_desc_table_rd.cpu_addr->flags;
> > > > > +	u32 *wr_flags = hw->dma_desc_table_wr.cpu_addr->flags;
> > > > > +	struct avalon_dma_desc *desc;
> > > > > +	struct virt_dma_desc *vdesc;
> > > > > +	bool rd_done;
> > > > > +	bool wr_done;
> > > > > +
> > > > > +	spin_lock(lock);
> > > > > +
> > > > > +	rd_done = (hw->h2d_last_id < 0);
> > > > > +	wr_done = (hw->d2h_last_id < 0);
> > > > > +
> > > > > +	if (rd_done && wr_done) {
> > > > > +		spin_unlock(lock);
> > > > > +		return IRQ_NONE;
> > > > > +	}
> > > > > +
> > > > > +	do {
> > > > > +		if (!rd_done && rd_flags[hw->h2d_last_id])
> > > > > +			rd_done = true;
> > > > > +
> > > > > +		if (!wr_done && wr_flags[hw->d2h_last_id])
> > > > > +			wr_done = true;
> > > > > +	} while (!rd_done || !wr_done);
> > > > This loop is very strange.  It feels like the last_id indexes needs
> > > > to atomic or protected from racing somehow so we don't do an out of
> > > > bounds read.
> > [...]
>
> > You're missing my point.  When we set
> > hw->d2h_last_id = 1;
> [1]
> > ...
> > hw->d2h_last_id = 2;
> [2]
>
> > There is a tiny moment where ->d2h_last_id is transitioning from 1 to 2
> > where its value is unknown.  We're in a busy loop here so we have a
> > decent chance of hitting that 1/1000,000th of a second.  If we happen to
> > hit it at exactly the right time then we're reading from a random
> > address and it will cause an oops.
> >
> > We have to use atomic_t types or something to handle race conditions.
>
> Err.. I am still missing the point :( In your example I do see a chance
> for a reader to read out 1 at point in time [2] - because of SMP race.
> But what could it be other than 1 or 2?
>

The 1 to 2 transition was a poorly chosen example; a -1 to 1
transition is a better one.  The CPU could write a byte at a time.  So
maybe it has only written the two highest bytes so far and the value is
now 0xffff.  It's not -1 and it's not 1 and it's not a valid index.

> Anyways, all code paths dealing with h2d_last_id and d2h_last_id indexes
> are protected with a spinlock.

You have to protect both the writer and the reader.  (That's why this
bug is so easy to spot).

https://lwn.net/Articles/793253/

regards,
dan carpenter
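
To make the -1 to 1 case concrete, here is a minimal userspace sketch in
C11.  It is not the driver's code: struct hw_state, DESC_NUM,
set_last_id() and wr_done() are hypothetical stand-ins I made up for
illustration.  The idea is only that the writer publishes the index with
one indivisible store, and the reader takes a single snapshot and
validates it before using it as an array index.  In the kernel the
corresponding tools would be atomic_t, READ_ONCE()/WRITE_ONCE(), or just
taking the same spinlock in every writer as well as every reader.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_NUM 128	/* hypothetical descriptor table size */

/* Hypothetical stand-in for the driver's hardware state. */
struct hw_state {
	atomic_int d2h_last_id;		/* negative: no transfer in flight */
	uint32_t wr_flags[DESC_NUM];	/* completion flags, one per descriptor */
};

/* Writer side: the new index becomes visible in one indivisible store. */
static void set_last_id(struct hw_state *hw, int id)
{
	assert(hw != NULL);
	atomic_store_explicit(&hw->d2h_last_id, id, memory_order_release);
}

/*
 * Reader side: load the index exactly once, check it, and only then
 * use it.  Re-reading hw->d2h_last_id between the check and the array
 * access would reopen the race.
 */
static bool wr_done(struct hw_state *hw)
{
	int id;

	assert(hw != NULL);
	id = atomic_load_explicit(&hw->d2h_last_id, memory_order_acquire);

	if (id < 0)
		return true;	/* nothing in flight, same as the original check */
	if (id >= DESC_NUM)
		return false;	/* defensive: never index out of bounds */

	return hw->wr_flags[id] != 0;
}
```

With this shape a reader can still see a stale index (the old one
instead of the new one), but it can never see a half-written value such
as 0xffff, and it never indexes the flags array with an unchecked
value.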