Hi Geert,

On Wed, Sep 30, 2020 at 1:16 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
>
> Hi Prabhakar,
>
> On Wed, Sep 30, 2020 at 11:48 AM Lad, Prabhakar
> <prabhakar.csengg@xxxxxxxxx> wrote:
> > On Tue, Sep 8, 2020 at 9:30 AM Lad, Prabhakar
> > <prabhakar.csengg@xxxxxxxxx> wrote:
> > > On Tue, Sep 8, 2020 at 8:05 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > On Mon, Sep 7, 2020 at 11:27 PM Lad, Prabhakar
> > > > <prabhakar.csengg@xxxxxxxxx> wrote:
> > > > > On Mon, Sep 7, 2020 at 1:05 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > > > On Fri, Sep 4, 2020 at 11:04 AM Lad, Prabhakar
> > > > > > <prabhakar.csengg@xxxxxxxxx> wrote:
> > > > > > > I am seeing "Unable to handle kernel paging request at virtual address
> > > > > > > xxxxxxxxxx" panic while running bonnie++ (version 1.04). I have
> > > > > > > managed to replicate this issue on R-Car M3N, G2[HMN]. I have been
> > > > > > > using renesas_defconfig for all the platforms and I have tested on
> > > > > > > Linux 5.9.0-rc3 for all the 4 platforms.
> > > > > > >
> > > > > > > Initially I was testing bonnie++ on eMMC device and later discovered
> > > > > > > even running bonnie++ on NFS mount is causing this issue. I have
> > > > > > > attached the logs for M3N while running bonnie++ on NFS and logs for
> > > > > > > G2N while running on eMMC.
> > > > > > >
> > > > > > > I even traced back to 5.2 kernel where initial G2M support was added
> > > > > > > and still able to see this issue.
> > > > > >
> > > > > > Thanks for your report!
> > > > > >
> > > > > > While the crash symptoms seem to be the same in all crash logs, the
> > > > > > backtraces aren't.
> > > > > >
> > > > > > Does disabling SMP (maxcpus=1) help?
> > > > > unfortunately no.
> > > >
> > > > OK, so it's not an SMP issue.
> > > >
> > > > > > Does switching from SLUB to SLAB, and enabling CONFIG_DEBUG_SLAB
> > > > > > reveal memory corruption?
> > > > > >
> > > > > attached are the logs for SLUB and SLAB with debug enabled on G2M
> > > > > rev.4.0 board (bonnie++-1.04) all the 4 combinations cause the kernel
> > > > > panic!
> > > > >
> > > > > SLUB -> 1 CPU -> BUG radix_tree_node (Not tainted): Padding overwritten.
> > > > > SLUB -> all 6 CPU's -> BUG kmalloc-2k (Not tainted): Padding overwritten.
> > > > >
> > > > > SLAB -> 1 CPU -> Slab corruption (Not tainted): nfs_write_data
> > > > > start=ffff000016c08840, len=912
> > > > > SLAB -> all 6 CPU's -> Unable to handle kernel paging request at
> > > > > virtual address 7d81858c9c9d9dd0 ([7d81858c9c9d9dd0] address between
> > > > > user and kernel address ranges)
> > > >
> > > > OK. So now we know something's overwriting its memory block. Either
> > > > it's writing too far, or a use-after-free case.
> > > > Now comes the hard part of finding who's responsible...
> > > >
> > > :)
> > >
> > > I checked out the very first commit where support for G2M was added
> > > and tested even this had an issue, so now I'll switch to R-Car M3N and
> > > perform the tests. Unfortunately I don't have any non Renesas arm64
> > > platform to perform similar tests.
> > >
> > To keep you posted, the issue has been cornered and is related to TFA
> > changes on RZ/G2x.
>
> Thanks for the update! I'm happy to hear it's not Linux' fault ;-)
>
:)

> Do you have more details about the TFA change? It might help in
> detecting similar issues in the future.
>
Not atm, will pass it on as soon as I have them.

Cheers,
Prabhakar