Re: BUG in sctp crashes sles10sp2 kernel

Michal Hocko wrote:
> Hi Vlad,
> 
> I am starting this new thread because I am starting to believe that the
> sles10sp2 kernel (based on the 2.6.16 upstream kernel) is hitting a
> different issue than the one we see in the upstream kernel (see below).
> 
> Karsten (CCed) has found the following:
> "
> OK I think the
> KERNEL: assertion (!atomic_read(&sk->sk_wmem_alloc)) failed at
> net/ipv4/af_inet.c (149)
> 
> is related to the main problem here. It says that at the time a socket
> is destroyed there is still some wmem allocated, which means there is
> still a transmit skb in flight. Since sctp uses skb destructors to do the
> memory accounting, this also means that after the socket is destroyed,
> the destructor of this skb will access the already freed socket struct.
> In some cases (if the memory is in use again and the pointers are
> already overwritten) this causes the crash at
> {sock_wfree+48} (which is a call to sk->sk_write_space(sk);). Of course
> it can crash in any other place too, since the accounting may overwrite
> pointers in whatever other struct reuses this memory.
> 
> I instrumented some routines with extra debug output (e.g.
> inet_sock_destruct) to see the amount of memory in sk->sk_wmem_alloc.
> It mostly shows:
> 
> Dec 11 12:31:16 gw kernel: inet_sock_destruct:
> sk(ffff810116960e00)->sk_wmem_alloc 496
> Dec 11 12:31:17 gw kernel: inet_sock_destruct:
> sk(ffff8101144f1b00)->sk_wmem_alloc 496
> Dec 11 12:31:18 gw kernel: inet_sock_destruct:
> sk(ffff8101144f1b00)->sk_wmem_alloc -496
> Dec 11 12:31:20 gw kernel: inet_sock_destruct:
> sk(ffff81011d461a00)->sk_wmem_alloc 496
> Dec 11 12:31:21 gw kernel: inet_sock_destruct:
> sk(ffff81011d460080)->sk_wmem_alloc 496
> 
> Note the -496. I think this is a case in which the same memory was
> allocated again for a socket struct, so the memory still held valid
> pointers, and the destructor call for the old socket decremented the
> memory counter of the new socket.
> 
> Do you agree with this analysis ?
> "
> 
> I am trying to go through git logs but maybe you remember some fix in
> this area.
> 
> If I understand correctly, then 20c2df83d25c6a95affe6157a4c9cac4cf5ffaac
> removes destructors from sctp completely, so the above should not
> happen upstream, should it?
> 


Here are a few commits that you need to check on:

61c9fed41638249f8b6ca5345064eb1beb50179f
[SCTP]: A better solution to fix the race between sctp_peeloff() and sctp_rcv().

cfdeef3282705a4b872d3559c4e7d2561251363c
[SCTP]: Unhash the endpoint in sctp_endpoint_free().

f26f7c480555812ca7c4037e0a50fa54afe2cb4a
[SCTP]: Add bind hash locking to the migrate code


All of the above commits address races in the SCTP code and are not in the base
2.6.16 kernel.

-vlad
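
For readers following the analysis quoted above, here is a minimal
user-space sketch of the accounting pattern Karsten describes. It is not
kernel code: toy_sock, toy_skb and every toy_* function are hypothetical
stand-ins, and a single static slot is assumed to model the allocator
handing the same memory out again. It only illustrates how a buffer that
outlives its socket can first leave 496 behind and then produce -496 on
whichever socket reuses the memory.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-ins for struct sock and struct sk_buff (hypothetical names). */
struct toy_sock {
    int wmem_alloc;          /* bytes charged by in-flight transmit buffers */
};

struct toy_skb {
    struct toy_sock *sk;     /* owner pointer; no reference is held on it */
    int truesize;            /* bytes charged to the owner */
};

/* One static slot models the allocator reusing the same memory. */
static struct toy_sock slot;

static struct toy_sock *toy_sock_alloc(void)
{
    slot.wmem_alloc = 0;     /* a fresh socket starts with nothing charged */
    return &slot;
}

/* Charge the buffer to its owner, as the transmit path would. */
static void toy_skb_set_owner_w(struct toy_skb *skb, struct toy_sock *sk,
                                int size)
{
    skb->sk = sk;
    skb->truesize = size;
    sk->wmem_alloc += size;
}

/* Destructor analogue: uncharge whatever skb->sk points at right now. */
static void toy_skb_wfree(struct toy_skb *skb)
{
    skb->sk->wmem_alloc -= skb->truesize;
}

/* Analogue of the added debug output: report and return the leftover. */
static int toy_sock_destruct(const struct toy_sock *sk)
{
    if (sk->wmem_alloc != 0)
        printf("toy_sock_destruct: sk(%p)->wmem_alloc %d\n",
               (const void *)sk, sk->wmem_alloc);
    return sk->wmem_alloc;
}

/* Replays the suspected sequence; returns the new socket's final counter. */
int toy_demo(int *leftover_old)
{
    struct toy_skb skb;
    struct toy_sock *old_sk, *new_sk;

    old_sk = toy_sock_alloc();
    toy_skb_set_owner_w(&skb, old_sk, 496);

    /* Socket destroyed while the skb is still in flight. */
    *leftover_old = toy_sock_destruct(old_sk);

    /* The same memory is handed out again for a new socket. */
    new_sk = toy_sock_alloc();
    assert(new_sk->wmem_alloc == 0);

    /* The late destructor follows its stale pointer into the new socket. */
    toy_skb_wfree(&skb);

    return toy_sock_destruct(new_sk);
}
```

Calling toy_demo() from a small main() replays the sequence once. In the
real kernel the stale pointer refers to freed memory, so the outcome is
undefined; the static slot here only makes the "memory reused by another
socket" case deterministic.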
