Re: bonnie++ causing kernel panic

Hi Geert,

On Wed, Sep 30, 2020 at 1:16 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
>
> Hi Prabhakar,
>
> On Wed, Sep 30, 2020 at 11:48 AM Lad, Prabhakar
> <prabhakar.csengg@xxxxxxxxx> wrote:
> > On Tue, Sep 8, 2020 at 9:30 AM Lad, Prabhakar
> > <prabhakar.csengg@xxxxxxxxx> wrote:
> > > On Tue, Sep 8, 2020 at 8:05 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > On Mon, Sep 7, 2020 at 11:27 PM Lad, Prabhakar
> > > > <prabhakar.csengg@xxxxxxxxx> wrote:
> > > > > On Mon, Sep 7, 2020 at 1:05 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > > > > On Fri, Sep 4, 2020 at 11:04 AM Lad, Prabhakar
> > > > > > <prabhakar.csengg@xxxxxxxxx> wrote:
> > > > > > > I am seeing "Unable to handle kernel paging request at virtual address
> > > > > > > xxxxxxxxxx" panic while running bonnie++ (version 1.04). I have
> > > > > > > managed to replicate this issue on R-Car M3N, G2[HMN]. I have been
> > > > > > > using renesas_defconfig for all the platforms and I have tested on
> > > > > > > Linux 5.9.0-rc3 for all the 4 platforms.
> > > > > > >
> > > > > > > Initially I was testing bonnie++ on an eMMC device, and later discovered
> > > > > > > that even running bonnie++ on an NFS mount causes this issue. I have
> > > > > > > attached the logs for M3N while running bonnie++ on NFS and logs for
> > > > > > > G2N while running on eMMC.
> > > > > > >
> > > > > > > I even traced back to the 5.2 kernel, where initial G2M support was
> > > > > > > added, and I am still able to see this issue.
> > > > > >
> > > > > > Thanks for your report!
> > > > > >
> > > > > > While the crash symptoms seem to be the same in all crash logs, the
> > > > > > backtraces aren't.
> > > > > >
> > > > > > Does disabling SMP (maxcpus=1) help?
> > > > > Unfortunately, no.
> > > >
> > > > OK, so it's not an SMP issue.
> > > >
> > > > > > Does switching from SLUB to SLAB, and enabling CONFIG_DEBUG_SLAB
> > > > > > reveal memory corruption?
> > > > > >
> > > > > Attached are the logs for SLUB and SLAB with debug enabled on a G2M
> > > > > rev.4.0 board (bonnie++-1.04). All 4 combinations cause a kernel
> > > > > panic!
> > > > >
> > > > > SLUB -> 1 CPU -> BUG radix_tree_node (Not tainted): Padding overwritten.
> > > > > SLUB -> all 6 CPUs -> BUG kmalloc-2k (Not tainted): Padding overwritten.
> > > > >
> > > > > SLAB -> 1 CPU -> Slab corruption (Not tainted): nfs_write_data
> > > > > start=ffff000016c08840, len=912
> > > > > SLAB -> all 6 CPUs -> Unable to handle kernel paging request at
> > > > > virtual address 7d81858c9c9d9dd0 ([7d81858c9c9d9dd0] address between
> > > > > user and kernel address ranges)
> > > >
> > > > OK. So now we know something's overwriting its memory block.  Either
> > > > it's writing too far, or a use-after-free case.
> > > > Now comes the hard part of finding who's responsible...
> > > >
> > > :)
> > >
> > > I checked out the very first commit where support for G2M was added
> > > and tested it; even this had the issue, so now I'll switch to R-Car M3N
> > > and perform the tests there. Unfortunately, I don't have any non-Renesas
> > > arm64 platform to perform similar tests.
> > >
> > To keep you posted, the issue has been cornered and is related to TFA
> > changes on RZ/G2x.
>
> Thanks for the update! I'm happy to hear it's not Linux' fault ;-)
>
:)

> Do you have more details about the TFA change? It might help in
> detecting similar issues in the future.
>
Not at the moment; I will pass them on as soon as I have them.

Cheers,
Prabhakar


