On Sat, Apr 30, 2022 at 1:40 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > > On Fri, Apr 29, 2022 at 11:43:11PM +0800, Hao Lee wrote: > > On Wed, Apr 20, 2022 at 7:38 PM Hao Lee <haolee.swjtu@xxxxxxxxx> wrote: > > > > > > On Tue, Apr 19, 2022 at 10:36:16AM -0700, Paul E. McKenney wrote: > > > > On Mon, Apr 18, 2022 at 06:23:59AM +0000, Hao Lee wrote: > > > > > On Sun, Apr 17, 2022 at 09:52:39AM -0700, Paul E. McKenney wrote: > > > > > > On Sun, Apr 17, 2022 at 11:09:47AM +0000, Hao Lee wrote: > > > > > > > CPU caches need to be flushed before DMA operations to avoid data > > > > > > > overwriting. > > > > > > > > > > > > > > Signed-off-by: Hao Lee <haolee.swjtu@xxxxxxxxx> > > > > > > > --- > > > > > > > appendix/whymb/whymemorybarriers.tex | 2 +- > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git a/appendix/whymb/whymemorybarriers.tex b/appendix/whymb/whymemorybarriers.tex > > > > > > > index b49d5fe8..4ec27962 100644 > > > > > > > --- a/appendix/whymb/whymemorybarriers.tex > > > > > > > +++ b/appendix/whymb/whymemorybarriers.tex > > > > > > > @@ -1640,7 +1640,7 @@ future such problems: > > > > > > > in any DMA buffer before presenting that buffer to the > > > > > > > I/O device. > > > > > > > > > > > > The paragraph above is before the I/O starts. Any changes to the buffer > > > > > > need to be flushed to memory so that the I/O device's DMA actually > > > > > > sees them. > > > > > > > > > > > > The next paragraph is the other end of I/O request, after it completes. > > > > > > So we cannot have "before". But maybe we need "invalidate" instead of > > > > > > "flush". > > > > > > > > > > Got it! > > > > > > > > > > > With a note that itcan be more efficient to both flush and > > > > > > invalidate prior to presenting the DMA buffer to the device. > > > > > > > > > > > > The point is that any pre-DMA data in the CPU caches must be gotten > > > > > > rid of in order for the CPUs to see the data that was actually DMA'ed > > > > > > into memory. > > > > > > > > > > Great explanation! Now I know we need something like WBINVD instruction > > > > > for the former case and INVD instruction for the latter. > > > > > > > > Glad that it helped! > > > > > > > > > > > Similarly, you need to flush the CPU caches of any location > > > > > > > > > > Should we change the word "flush" to "invalidate" to differentiate these > > > > > two opposite DMA operations? > > > > > > > > If after the DMA, we need to care about the difference, hasn't the CPU > > > > incorrectly written to part of the DMA buffer during the DMA operation? > > > > > > Sorry, I can't understand this passage... Maybe I didn't make myself > > > clear. I find a great example in arch/powerpc/include/asm/cacheflush.h > > > > > > Before DMA provides output buffer to devices, we should call > > > flush_dcache_range or clean_dcache_range to let the devices see the > > > latest data generated by the CPU. After DMA writes data into the input > > > buffer, we should use invalidate_dcache_range to invalidate the CPU > > > cache so the CPU can read the latest data written by devices. > > > > > > Should we change the word "flush" in the following sentence to "invalidate"? > > > > > > Similarly, you need to flush the CPU caches of any location in > > > any DMA buffer after DMA to that buffer completes. > > > > > > > Hello, > > > > My above reply may be missed :) > > No, not missed, just slowness on my part. For whatever reason, the past > few months have seen unusually high levels of activity for Linux-kernel > RCU [1]. Plus I am presenting at the LSF/MM summit next week. > > But I will get to this, I promise! Oh, thanks! Take your time. > > In the meantime, one of the things that needs to be done to Appendix C to > bring it up to chapter status is to provide a state table for each of the > sequences of events, then converting the enumerated steps to paragraphs. > See for example Table 15.1 and the corresponding discussion. Would you > be interested in constructing such tables? That would speed things up! No problem! I'm glad I can contribute to this book! Thanks, Hao Lee > > Thanx, Paul > > [1] https://docs.google.com/document/d/1GCdQC8SDbb54W1shjEXqGZ0Rq8a6kIeYutdSIajfpLA/edit?usp=sharing > > > Thanks, > > Hao Lee > > > > > > > > > If you are thinking of some other scenario, could you please list the > > > > events leading up to that scenrio? > > > > > > > > > > > - in any DMA buffer after DMA to that buffer completes. > > > > > > > + in any DMA buffer before DMA to that buffer completes. > > > > > > > And even then, you need to be \emph{very} careful to avoid > > > > > > > pointer bugs, as even a misplaced read to an input buffer > > > > > > > can result in corrupting the data input! > > > > > > > > > > > > And this last clause could just wordsmithed a bit. > > > > > > > > > > Maybe it's not quite rigorous? > > > > > > > > > > If I understand correctly, a misplaced read will load the data from > > > > > input buffer area to cache line. If the cpu doesn't write to this cache > > > > > line, I think this case can only result in that the cpu reads stale data > > > > > in subsequent reads but won't corrupt the input buffer given that this > > > > > cache line is clean and won't writeback to memory. > > > > > > > > If the user code that accesses the buffer runs on this same CPU, it will > > > > look to that CPU as if the DMA buffer was corrupted, correct? > > > > > > Ah, this indeed makes sense! > > > > > > > After > > > > all, that user code will see the value loaded by the CPU, not the value > > > > actually DMAed into memory. > > > > > > > > Hmmm... It might be necessary to spell out these DMA-corruption > > > > scenarios. Thoughts? > > > > > > Yes. I think spelling out these scenarios should make this passage easier to > > > understand. > > > > > > Thanks, > > > Hao Lee > > > > > > > > > > > Thanx, Paul