Re: [PATCHv2] sctp: Don't add the shutdown timer if its already been added

On Tue, May 19, 2020 at 04:24:10PM -0300, Marcelo Ricardo Leitner wrote:
> On Tue, May 19, 2020 at 11:51:23AM +0300, Jere Leppänen wrote:
> > On Thu, 30 Apr 2020, Marcelo Ricardo Leitner wrote:
> > 
> > > On Wed, Apr 29, 2020 at 07:36:13AM -0400, Neil Horman wrote:
> > > > This BUG halt was reported a while back, but the patch somehow got
> > > > missed:
> > > > 
> > > > PID: 2879   TASK: c16adaa0  CPU: 1   COMMAND: "sctpn"
> > > >  #0 [f418dd28] crash_kexec at c04a7d8c
> > > >  #1 [f418dd7c] oops_end at c0863e02
> > > >  #2 [f418dd90] do_invalid_op at c040aaca
> > > >  #3 [f418de28] error_code (via invalid_op) at c08631a5
> > > >     EAX: f34baac0  EBX: 00000090  ECX: f418deb0  EDX: f5542950  EBP: 00000000
> > > >     DS:  007b      ESI: f34ba800  ES:  007b      EDI: f418dea0  GS:  00e0
> > > >     CS:  0060      EIP: c046fa5e  ERR: ffffffff  EFLAGS: 00010286
> > > >  #4 [f418de5c] add_timer at c046fa5e
> > > >  #5 [f418de68] sctp_do_sm at f8db8c77 [sctp]
> > > >  #6 [f418df30] sctp_primitive_SHUTDOWN at f8dcc1b5 [sctp]
> > > >  #7 [f418df48] inet_shutdown at c080baf9
> > > >  #8 [f418df5c] sys_shutdown at c079eedf
> > > >  #9 [f418df70] sys_socketcall at c079fe88
> > > >     EAX: ffffffda  EBX: 0000000d  ECX: bfceea90  EDX: 0937af98
> > > >     DS:  007b      ESI: 0000000c  ES:  007b      EDI: b7150ae4
> > > >     SS:  007b      ESP: bfceea7c  EBP: bfceeaa8  GS:  0033
> > > >     CS:  0073      EIP: b775c424  ERR: 00000066  EFLAGS: 00000282
> > > > 
> > > > It appears that the side effect that starts the shutdown timer was processed
> > > > multiple times, which can happen as multiple paths can trigger it.  This of
> > > > course leads to the BUG halt in add_timer getting called.
> > > > 
> > > > The fix seems pretty straightforward: before the timer is added, check whether
> > > > it's already been started.  If it has, mod the timer instead to min(current
> > > > expiration, new expiration).
> > > > 
> > > > It's been tested but not confirmed to fix the problem, as the issue has only
> > > > occurred in production environments where test kernels are enjoined from being
> > > > installed.  It appears to be a sane fix to me though.  Also, recently,
> > > > Jere found a reproducer, posted on list, to confirm that this resolves the
> > > > issue.
> > > > 
> > > > Signed-off-by: Neil Horman <nhorman@xxxxxxxxxxxxx>
> > > > CC: Vlad Yasevich <vyasevich@xxxxxxxxx>
> > > > CC: "David S. Miller" <davem@xxxxxxxxxxxxx>
> > > > CC: Jere Leppänen <jere.leppanen@xxxxxxxxx>
> > > > CC: marcelo.leitner@xxxxxxxxx
> > > > 
> > > > ---
> > > > Change notes:
> > > > V2) Updated to use timer_reduce
> > > 
> > > Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@xxxxxxxxx>
> > 
> > Hey is this patch falling through the cracks again? No rush, I'm just 
> > wondering what's going on.
> 
> Whoops, sounds like Neil forgot to Cc netdev@..
> 
>   Marcelo
> 
Crap, my bad.  I'm stuck in a call at the moment, but I'll resend this tomorrow
morning.
Thanks for following up!
Neil
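
For anyone following the thread without the diff at hand, the check described
in the quoted changelog can be sketched in userspace C.  This is only a model
under stated assumptions, not the patch itself: struct model_timer and the
model_* functions are hypothetical stand-ins for struct timer_list,
add_timer() and timer_reduce(), and plain integers stand in for jiffies, so
counter wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel timer; NOT struct timer_list. */
struct model_timer {
	bool pending;          /* true once the timer has been armed */
	unsigned long expires; /* absolute expiry, in model ticks */
};

/*
 * Models the old behaviour: arming a timer that is already pending is
 * the case that trips the BUG in add_timer().  The model reports it by
 * returning false instead of halting.
 */
static bool model_add_timer(struct model_timer *t, unsigned long expires)
{
	if (t->pending)
		return false; /* the kernel would BUG here */
	t->pending = true;
	t->expires = expires;
	return true;
}

/*
 * Models the timer_reduce() idea: start the timer if it is idle,
 * otherwise only ever move the expiry earlier.  Returns 1 if the timer
 * was already pending, 0 if it was started by this call.
 * Assumption: a plain '<' compare, with no wraparound handling.
 */
static int model_timer_reduce(struct model_timer *t, unsigned long expires)
{
	if (!t->pending) {
		t->pending = true;
		t->expires = expires;
		return 0;
	}
	if (expires < t->expires)
		t->expires = expires;
	return 1;
}

/*
 * Hypothetical "start timer" side effect that is safe to process more
 * than once: the result is min(current expiration, new expiration).
 */
static void model_start_timer(struct model_timer *t, unsigned long now,
			      unsigned long timeout)
{
	assert(timeout > 0); /* model assumption: timeouts are non-zero */
	model_timer_reduce(t, now + timeout);
}
```

With this model, calling model_start_timer() twice on the same timer leaves it
pending with the earlier of the two expiries, whereas a second
model_add_timer() call reports the double-add case from the backtrace.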
