On Mon, 2018-12-17 at 19:55 +0100, Geert Uytterhoeven wrote: > Hi Trond, > > (For the newly added CCs, first message was > https://lore.kernel.org/lkml/CAMuHMdVJr0PwvJg3FeTCy7vxuyY1=S1tPLHO7hPsoZX4wZ+-cQ@xxxxxxxxxxxxxx/) > > > On Mon, Dec 17, 2018 at 3:51 PM Trond Myklebust < > > trondmy@xxxxxxxxxxxxxxx> wrote: > > > On Mon, 2018-12-17 at 15:03 +0100, Geert Uytterhoeven wrote: > > > > On Wed, Dec 5, 2018 at 3:47 PM Geert Uytterhoeven < > > > > geert@xxxxxxxxxxxxxx> wrote: > > > > > On Wed, Dec 5, 2018 at 2:45 PM Trond Myklebust < > > > > > trondmy@xxxxxxxxxxxxxxx> wrote: > > > > > > On Wed, 2018-12-05 at 14:41 +0100, Geert Uytterhoeven > > > > > > wrote: > > > > > > > On Wed, Dec 5, 2018 at 2:11 PM Atsushi Nemoto < > > > > > > > anemo@xxxxxxxxxxxxx> > > > > > > > wrote: > > > > > > > > On Tue, 4 Dec 2018 14:53:07 +0100, Geert Uytterhoeven < > > > > > > > > geert@xxxxxxxxxxxxxx> wrote: > > > > > > > > > I found similar crashes in a report from 2006, but of > > > > > > > > > course the > > > > > > > > > code > > > > > > > > > has changed too much to apply the solution proposed > > > > > > > > > there > > > > > > > > > ( > > > > > > > > > https://www.linux-mips.org/archives/linux-mips/2006-09/msg00169.html > > > > > > > > > ). > > > > > > > > > > > > > > > > > > Userland is Debian 8 (the last release supporting > > > > > > > > > "old" > > > > > > > > > MIPS). > > > > > > > > > My kernel is based on v4.20.0-rc5, but the issue > > > > > > > > > happens > > > > > > > > > with > > > > > > > > > v4.20-rc1, > > > > > > > > > too. > > > > > > > > > > > > > > > > > > However, I noticed it works in v4.19! Hence I've > > > > > > > > > bisected > > > > > > > > > this, > > > > > > > > > to commit > > > > > > > > > 277e4ab7d530bf28 ("SUNRPC: Simplify TCP receive code > > > > > > > > > by > > > > > > > > > switching > > > > > > > > > to using > > > > > > > > > iterators"). > > > > > > > > > > > > > > > > > > Dropping the ",tcp" part from the nfsroot parameter > > > > > > > > > also > > > > > > > > > fixes > > > > > > > > > the issue. > > > > > > > > > > > > > > > > > > Given RBTX4927 is little endian, just like my > > > > > > > > > arm/arm64 > > > > > > > > > boards, > > > > > > > > > it's probably > > > > > > > > > not an endianness issue. Sparse didn't show anything > > > > > > > > > suspicious > > > > > > > > > before/after > > > > > > > > > the guilty commit. > > > > > > > > > > > > > > > > > > Do you have a clue? > > > > > > > > > > > > > > > > If it was a cache issue, disabling i-cache or d-cache > > > > > > > > completely > > > > > > > > might > > > > > > > > help understanding the problem. I added TXx9 specific > > > > > > > > "icdisable" > > > > > > > > and > > > > > > > > "dcdisable" kernel options for debugging long ago. > > > > > > > > > > > > > > > > I hope these options still works correctly with recent > > > > > > > > kernel > > > > > > > > but > > > > > > > > not > > > > > > > > sure. > > > > > > > > > > > > > > > > Also, disabling i-cache makes your board VERY slow, of > > > > > > > > course. > > > > > > > > > > > > > > Thanks! > > > > > > > > > > > > > > When using these options, I do see a slowdown in early > > > > > > > boot, > > > > > > > but the > > > > > > > issue > > > > > > > is still there. > > > > > > > > > > > > > > My next guess is an unaligned access not using > > > > > > > {get,put}_unaligned(), > > > > > > > which > > > > > > > doesn't seem to work on tx4927, but doesn't cause an > > > > > > > exception > > > > > > > neither. > > > > > > > > > > > > Can you try my linux-next branch on git.linux-nfs.org? It > > > > > > contains a > > > > > > fixes for a hang that results from the above commit. > > > > > > > > > > > > git pull git://git.linux-nfs.org/projects/trondmy/linux- > > > > > > nfs.git > > > > > > linux-next > > > > > > > > > > Thanks for the suggestion, but unfortunately it doesn't help. > > > > > > > > In the mean time, I tried your newer linux-next, no change. > > > > I tried several other things: > > > > - remove the packed attribute (why did you add that?), > > > > > > The packed attribute allows us to avoid a series of copy > > > operations > > > when decoding the first three elements of a RPC over TCP header > > > (which > > > is why they are all declared as big endian). The alternative > > > would be > > > to have a 12 byte buffer there for temporary storage, and then a > > > duplicate set of 3 32-bit words into which we copy the buffer > > > contents > > > after extracting them from the (non-blocking) socket. > > > > > > > - verify (at runtime) that all accesses to fraghdr, xid, and > > > > calldir > > > > are aligned, > > > > - enable RPC_DEBUG_DATA, nothing fishy seen at first sight. > > > > > > > > Is anyone else seeing this on MIPS, or any other platform? > > > > Does mounting NFS with -o nfsvers=3,tcp work on other MIPS > > > > platforms? > > > > > > I have no access to any MIPS hardware for the purposes of testing > > > so > > > that would be a question for the community. > > > > > > One thing that I have noticed is that unlike the old code, the > > > bvec > > > 'generic' code does appear to fail to call flush_dcache_page(). > > > Could > > > that be causing the problem here? If so, why would that not be a > > > problem in the context of regular block I/O? > > Thanks for the hint! > > It wasn't clear to me where exactly the old code called > flush_dcache_page(), but as rpcrdma_inline_fixup() calls it in > between > copying to a page, and unmapping the page, I added a call to > flush_dcache_page() to all functions in lib/iov_iter.c that map a > page > and copy to it, cfr. the patch below. > > And suddenly NFS root over TCP is working again! Hah! ☺ > > Note that I have no idea if it affects regular block I/O, as my > RBTX4927 > does not have block devices. > > Also note that this platform does not use highmem. > > So, where's the proper place to fix this? > Thanks in advance! Given that one of the main use cases for iov_iter is the page cache, I think that your patch below is the correct one. However perhaps Al can comment? > > diff --git a/lib/iov_iter.c b/lib/iov_iter.c > index 54c248526b55fc49..5be62db33414d3f9 100644 > --- a/lib/iov_iter.c > +++ b/lib/iov_iter.c > @@ -277,6 +277,7 @@ static size_t copy_page_from_iter_iovec(struct > page *page, size_t offset, size_t > to += copy; > bytes -= copy; > } > + flush_dcache_page(page); > if (likely(!bytes)) { > kunmap_atomic(kaddr); > goto done; > @@ -463,6 +464,7 @@ static void memcpy_to_page(struct page *page, > size_t offset, const char *from, s > { > char *to = kmap_atomic(page); > memcpy(to + offset, from, len); > + flush_dcache_page(page); > kunmap_atomic(to); > } > > @@ -470,6 +472,7 @@ static void memzero_page(struct page *page, > size_t offset, size_t len) > { > char *addr = kmap_atomic(page); > memset(addr + offset, 0, len); > + flush_dcache_page(page); > kunmap_atomic(addr); > } > > @@ -580,6 +583,7 @@ static size_t csum_and_copy_to_pipe_iter(const > void *addr, size_t bytes, > char *p = kmap_atomic(pipe->bufs[idx].page); > next = csum_partial_copy_nocheck(addr, p + r, chunk, > 0); > sum = csum_block_add(sum, next, off); > + flush_dcache_page(pipe->bufs[idx].page); > kunmap_atomic(p); > i->idx = idx; > i->iov_offset = r + chunk; > @@ -628,6 +632,7 @@ static unsigned long memcpy_mcsafe_to_page(struct > page *page, size_t offset, > > to = kmap_atomic(page); > ret = memcpy_mcsafe(to + offset, from, len); > + flush_dcache_page(page); > kunmap_atomic(to); > > return ret; > @@ -894,6 +899,7 @@ size_t copy_page_from_iter(struct page *page, > size_t offset, size_t bytes, > if (i->type & (ITER_BVEC|ITER_KVEC)) { > void *kaddr = kmap_atomic(page); > size_t wanted = _copy_from_iter(kaddr + offset, bytes, > i); > + flush_dcache_page(page); > kunmap_atomic(kaddr); > return wanted; > } else > @@ -958,6 +964,7 @@ size_t iov_iter_copy_from_user_atomic(struct page > *page, > v.bv_offset, v.bv_len), > memcpy((p += v.iov_len) - v.iov_len, v.iov_base, > v.iov_len) > ) > + flush_dcache_page(page); > kunmap_atomic(kaddr); > return bytes; > } > @@ -1494,6 +1501,7 @@ size_t csum_and_copy_to_iter(const void *addr, > size_t bytes, __wsum *csum, > next = csum_partial_copy_nocheck((from += v.bv_len) - > v.bv_len, > p + v.bv_offset, > v.bv_len, 0); > + flush_dcache_page(v.bv_page); > kunmap_atomic(p); > sum = csum_block_add(sum, next, off); > off += v.bv_len; > > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- > geert@xxxxxxxxxxxxxx > -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx