Hi Geert,

On Mon, 2018-12-17 at 15:03 +0100, Geert Uytterhoeven wrote:
> Hi Trond,
>
> On Wed, Dec 5, 2018 at 3:47 PM Geert Uytterhoeven
> <geert@xxxxxxxxxxxxxx> wrote:
> > On Wed, Dec 5, 2018 at 2:45 PM Trond Myklebust
> > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > On Wed, 2018-12-05 at 14:41 +0100, Geert Uytterhoeven wrote:
> > > > On Wed, Dec 5, 2018 at 2:11 PM Atsushi Nemoto
> > > > <anemo@xxxxxxxxxxxxx> wrote:
> > > > > On Tue, 4 Dec 2018 14:53:07 +0100, Geert Uytterhoeven
> > > > > <geert@xxxxxxxxxxxxxx> wrote:
> > > > > > I found similar crashes in a report from 2006, but of course
> > > > > > the code has changed too much to apply the solution proposed
> > > > > > there
> > > > > > (https://www.linux-mips.org/archives/linux-mips/2006-09/msg00169.html).
> > > > > >
> > > > > > Userland is Debian 8 (the last release supporting "old" MIPS).
> > > > > > My kernel is based on v4.20.0-rc5, but the issue happens with
> > > > > > v4.20-rc1, too.
> > > > > >
> > > > > > However, I noticed it works in v4.19! Hence I've bisected
> > > > > > this, to commit 277e4ab7d530bf28 ("SUNRPC: Simplify TCP
> > > > > > receive code by switching to using iterators").
> > > > > >
> > > > > > Dropping the ",tcp" part from the nfsroot parameter also fixes
> > > > > > the issue.
> > > > > >
> > > > > > Given RBTX4927 is little endian, just like my arm/arm64
> > > > > > boards, it's probably not an endianness issue. Sparse didn't
> > > > > > show anything suspicious before/after the guilty commit.
> > > > > >
> > > > > > Do you have a clue?
> > > > >
> > > > > If it was a cache issue, disabling i-cache or d-cache completely
> > > > > might help understanding the problem.
> > > > > I added TXx9 specific "icdisable" and "dcdisable" kernel options
> > > > > for debugging long ago.
> > > > >
> > > > > I hope these options still works correctly with recent kernel
> > > > > but not sure.
> > > > >
> > > > > Also, disabling i-cache makes your board VERY slow, of course.
> > > >
> > > > Thanks!
> > > >
> > > > When using these options, I do see a slowdown in early boot, but
> > > > the issue is still there.
> > > >
> > > > My next guess is an unaligned access not using
> > > > {get,put}_unaligned(), which doesn't seem to work on tx4927, but
> > > > doesn't cause an exception neither.
> > >
> > > Can you try my linux-next branch on git.linux-nfs.org? It contains a
> > > fixes for a hang that results from the above commit.
> > >
> > > git pull git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
> >
> > Thanks for the suggestion, but unfortunately it doesn't help.
>
> In the mean time, I tried your newer linux-next, no change.
> I tried several other things:
>   - remove the packed attribute (why did you add that?),

The packed attribute allows us to avoid a series of copy operations
when decoding the first three elements of an RPC over TCP header (which
is why they are all declared as big endian).
The alternative would be to have a 12-byte buffer there for temporary
storage, and then a duplicate set of three 32-bit words into which we
copy the buffer contents after extracting them from the (non-blocking)
socket.

>   - verify (at runtime) that all accesses to fraghdr, xid, and calldir
>     are aligned,
>   - enable RPC_DEBUG_DATA, nothing fishy seen at first sight.
>
> Is anyone else seeing this on MIPS, or any other platform?
> Does mounting NFS with -o nfsvers=3,tcp work on other MIPS platforms?

I have no access to any MIPS hardware for the purposes of testing, so
that would be a question for the community.
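
To make the point about the packed attribute a bit more concrete, here
is a minimal userspace sketch of the idea. It is not the actual SUNRPC
code: the struct name rpc_tcp_hdr and the helpers hdr_fill(),
hdr_complete() and hdr_fragment_len() are made up for illustration, and
only the field names fraghdr, xid and calldir come from this thread. It
assumes the GCC/Clang packed attribute and POSIX ntohl().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl() */

/*
 * Hypothetical stand-in for the first three words of an RPC over TCP
 * header. All three are kept in wire (big endian) order, and the struct
 * is packed so that it is exactly 12 contiguous bytes.
 */
struct rpc_tcp_hdr {
	uint32_t fraghdr;   /* record marker: last-fragment bit + length */
	uint32_t xid;       /* transaction id */
	uint32_t calldir;   /* call/reply direction */
} __attribute__((packed));

_Static_assert(sizeof(struct rpc_tcp_hdr) == 12, "header must be 12 bytes");

/*
 * Copy up to 'len' received bytes straight into the header, starting at
 * byte 'offset'. A non-blocking socket may deliver the header in pieces,
 * so the caller keeps the returned offset and calls again later.
 */
static size_t hdr_fill(struct rpc_tcp_hdr *hdr, size_t offset,
		       const unsigned char *buf, size_t len)
{
	size_t want;

	assert(offset <= sizeof(*hdr));
	want = sizeof(*hdr) - offset;
	if (len > want)
		len = want;
	memcpy((unsigned char *)hdr + offset, buf, len);
	return offset + len;
}

/* True once all 12 header bytes have arrived. */
static int hdr_complete(size_t offset)
{
	return offset == sizeof(struct rpc_tcp_hdr);
}

/* Fragment length: the low 31 bits of the record marker. */
static uint32_t hdr_fragment_len(const struct rpc_tcp_hdr *hdr)
{
	return ntohl(hdr->fraghdr) & 0x7fffffffu;
}
```

Because the bytes land directly in the fields, there is no separate
12-byte scratch buffer and no second copy into three host-side words;
the fields are only byte-swapped at the point of use.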

One thing that I have noticed is that unlike the old code, the bvec
'generic' code does appear to fail to call flush_dcache_page(). Could
that be causing the problem here? If so, why would that not be a problem
in the context of regular block I/O?

Cheers
  Trond

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx