Re: Floating point register corruption on ARM Cortex A57 (ARMv8) with RT_PREEMPT linux

Just wanted to give a quick update. I could not get a 4.18 BSP from the
vendor, since they only release BSPs for LTS kernels, so I could not
move to the 4.18 kernel. Instead, I diffed arch/arm64/kernel/fpsimd.c
between my 4.14 version and 4.18. I did not port the SVE part; I only
backported the preempt_disable()/preempt_enable() calls that the 4.18
version has around local_bh_disable()/local_bh_enable() in the fpsimd*
functions. With that fix, I no longer see the floating point corruption.
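
For anyone who wants to check their own board, here is a minimal sketch of
the kind of user-space check I mean. It is not the actual test_float source;
fp_workload(), run_workload and count_fp_mismatches() are hypothetical names
made up for illustration. The idea is to repeat a deterministic floating
point computation and count runs whose result differs bit-for-bit from a
reference run. It assumes the reference run itself was not corrupted.

```c
#include <string.h>

/* Hypothetical FP-heavy workload (not the original test_float): a fixed
 * recurrence whose result depends only on its inputs. */
static double fp_workload(double seed, int steps)
{
    double a = seed, b = seed * 0.5, c = seed + 1.0;

    for (int i = 0; i < steps; i++) {
        a = a * 0.999 + b * 0.001;
        b = b * 0.998 + c * 0.002;
        c = c * 0.997 + a * 0.003;
    }
    return a + b + c;
}

/* Call through a volatile pointer so the compiler cannot inline or
 * constant-fold the workload; every call runs the same machine code. */
static double (*volatile run_workload)(double, int) = fp_workload;

/* Returns how many of `iterations` runs differ bit-for-bit from a
 * reference run computed once at the start. */
long count_fp_mismatches(long iterations)
{
    const double ref = run_workload(1.5, 10000);
    long mismatches = 0;

    for (long i = 0; i < iterations; i++) {
        const double got = run_workload(1.5, 10000);

        if (memcmp(&got, &ref, sizeof got) != 0)
            mismatches++;
    }
    return mismatches;
}
```

Calling count_fp_mismatches() from a small main() with a large iteration
count, once as SCHED_OTHER and once under "chrt -f 5", would mirror the
comparison in the quoted message below. A non-zero count would suggest that
FP state was lost somewhere between runs, but it does not say where.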
On Sun, Oct 7, 2018 at 10:35 PM Anup Pemmaiah <anup.pemmaiah@xxxxxxxxx> wrote:
>
> Some more observations with RT_PREEMPT configs enabled.
>
> 1) I re-ran the tests with all crypto (including NEON-related crypto)
> and EFI kernel config options disabled. I still see floating point
> registers getting corrupted at random.
>
> 2) I noticed that when I run the tests with an RT scheduler and RT
> priority, e.g. "chrt -f 5 ./test_float" or "chrt -r 5 ./test_float",
> I am not able to reproduce the corruption issue. But when I run the
> test without any RT scheduler or priority (just ./test_float, i.e.
> SCHED_OTHER), I can easily reproduce the issue.
>
> I tried disabling PREEMPT_LAZY with "echo NO_PREEMPT_LAZY >
> /sys/kernel/debug/sched_features". It did not help; I am still able
> to reproduce the problem.
>
> 3) I have another ARM Cortex-A57 system from a different vendor (I
> cannot name the vendors for proprietary reasons) running Linux kernel
> version 4.9.38 with RT_PREEMPT enabled. There I do not see any floating
> point corruption, whether I run the test as SCHED_OTHER or with
> real-time settings. So that tells me moving to 4.18 may not help. What
> do you think?
>
> Thanks
> Anup
> On Sun, Oct 7, 2018 at 9:58 AM Anup Pemmaiah <anup.pemmaiah@xxxxxxxxx> wrote:
> >
> > > nope, should work by default. Do you have NEON related crypto code or
> > > EFI enabled?
> >
> > Sebastian, thank you for the comments. I have NEON-related crypto code
> > enabled right now, but I remember disabling it before and it did not
> > make a difference. I will disable it again and give it a try. In the
> > meantime, when I disabled the following 4 lines in the config file
> > and recompiled the kernel, the test code ran fine without the floating
> > point issue described earlier. Do you suspect that NEON-related crypto
> > interferes with the real-time kernel but not with the non-RT kernel?
> >
> >
> >   # CONFIG_PREEMPT_RT_BASE=y
> >
> >   # CONFIG_HAVE_PREEMPT_LAZY=y
> >
> >   # CONFIG_PREEMPT_LAZY=y
> >
> >   # CONFIG_PREEMPT_RT_FULL=y
> >
> >
> > > Could you please try the latest v4.18? I believe it is fixed there and
> > > needs just backporting. Could you please try?
> >
> > I will try it as a last resort, because I am not sure the board BSP
> > supports v4.18. Right now, I am trying to figure out why it works fine
> > with the non-RT kernel and I only see the issue when the above four
> > RT_PREEMPT config options are turned on.
> >
> >
> > On Fri, Oct 5, 2018, 9:55 AM Sebastian Andrzej Siewior
> > <sebastian.siewior@xxxxxxxxxxxxx> wrote:
> > >
> > > On 2018-10-04 19:12:53 [-0700], Anup Pemmaiah wrote:
> > > > 1) Is there any floating point related kernel setting that I should
> > > > set in the RT_PREEMPT kernel? I have set eagerfpu=on (even though it
> > > > is on by default)
> > >
> > > nope, should work by default. Do you have NEON related crypto code or
> > > EFI enabled?
> > >
> > > > 2) Was reading about "Lazy Stacking" for Cortex-M4, but not sure if it
> > > > applies for Cortex A57
> > > >
> > > > Any comments will be greatly appreciated.
> > >
> > > Could you please try the latest v4.18? I believe it is fixed there and
> > > needs just backporting. Could you please try?
> > >
> > >
> > > Sebastian


