Re: System crash in netfilter 5.10.25

Hello Florian,

I need assistance with this one. Our customer's system, running 5.10.25-flatcar, crashed with the following trace:

Aug 26 10:26:32.686733 amc-k8sdevsl01-worker-lx13 kernel: ------------[ cut here ]------------
Aug 26 10:26:32.686855 amc-k8sdevsl01-worker-lx13 kernel: refcount_t: underflow; use-after-free.
Aug 26 10:26:32.686877 amc-k8sdevsl01-worker-lx13 kernel: WARNING: CPU: 4 PID: 2422635 at lib/refcount.c:28 refcount_warn_saturat>
Aug 26 10:26:32.686930 amc-k8sdevsl01-worker-lx13 kernel: Modules linked in: binfmt_misc nfnetlink_queue xt_NFQUEUE xt_multiport >
Aug 26 10:26:32.689906 amc-k8sdevsl01-worker-lx13 kernel:  dm_region_hash dm_log dm_mod
Aug 26 10:26:32.690398 amc-k8sdevsl01-worker-lx13 kernel: CPU: 4 PID: 2422635 Comm: worker-1 Not tainted 5.10.25-flatcar #1
Aug 26 10:26:32.690526 amc-k8sdevsl01-worker-lx13 kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Refer>
Aug 26 10:26:32.691653 amc-k8sdevsl01-worker-lx13 kernel: RIP: 0010:refcount_warn_saturate+0xa6/0xf0
Aug 26 10:26:32.691720 amc-k8sdevsl01-worker-lx13 kernel: Code: 05 3c 1d 40 01 01 e8 81 46 38 00 0f 0b c3 80 3d 2a 1d 40 01 00 75>
Aug 26 10:26:32.691747 amc-k8sdevsl01-worker-lx13 kernel: RSP: 0018:ffffa3a0c3627938 EFLAGS: 00010282
Aug 26 10:26:32.692385 amc-k8sdevsl01-worker-lx13 kernel: RAX: 0000000000000000 RBX: ffff8c011b14fa00 RCX: 0000000000000027
Aug 26 10:26:32.692422 amc-k8sdevsl01-worker-lx13 kernel: RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff8c045d918b08
Aug 26 10:26:32.692446 amc-k8sdevsl01-worker-lx13 kernel: RBP: ffff8c011b14fa00 R08: ffff8c045d918b00 R09: ffffa3a0c3627750
Aug 26 10:26:32.693526 amc-k8sdevsl01-worker-lx13 kernel: R10: 0000000000000001 R11: 0000000000000001 R12: ffff8c011b14fa30
Aug 26 10:26:32.693584 amc-k8sdevsl01-worker-lx13 kernel: R13: 0000000000000002 R14: ffff8bfda3b43180 R15: ffff8c00cddb3a00
Aug 26 10:26:32.693615 amc-k8sdevsl01-worker-lx13 kernel: FS:  00007ff7a2331b38(0000) GS:ffff8c045d900000(0000) knlGS:00000000000>
Aug 26 10:26:32.693649 amc-k8sdevsl01-worker-lx13 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 26 10:26:32.694304 amc-k8sdevsl01-worker-lx13 kernel: CR2: 00007ff79ac17a28 CR3: 00000001ee34e003 CR4: 00000000007706e0
Aug 26 10:26:32.694334 amc-k8sdevsl01-worker-lx13 kernel: PKRU: 55555554
Aug 26 10:26:32.694351 amc-k8sdevsl01-worker-lx13 kernel: Call Trace:
Aug 26 10:26:32.694370 amc-k8sdevsl01-worker-lx13 kernel:  nf_queue_entry_release_refs+0x82/0xa0
Aug 26 10:26:32.695381 amc-k8sdevsl01-worker-lx13 kernel:  nf_reinject+0x6f/0x1a0
Aug 26 10:26:32.695404 amc-k8sdevsl01-worker-lx13 kernel:  0xffffffffc0857980
Aug 26 10:26:32.695425 amc-k8sdevsl01-worker-lx13 kernel:  nfnetlink_unicast+0x1f1/0x420 [nfnetlink]
Aug 26 10:26:32.695441 amc-k8sdevsl01-worker-lx13 kernel:  ? cred_has_capability+0x7f/0x120
Aug 26 10:26:32.695457 amc-k8sdevsl01-worker-lx13 kernel:  ? nfnetlink_unicast+0xa0/0x420 [nfnetlink]
Aug 26 10:26:32.695475 amc-k8sdevsl01-worker-lx13 kernel:  netlink_rcv_skb+0x50/0x100
Aug 26 10:26:32.696440 amc-k8sdevsl01-worker-lx13 kernel:  nfnetlink_subsys_register+0x789/0x869 [nfnetlink]
Aug 26 10:26:32.696465 amc-k8sdevsl01-worker-lx13 kernel:  netlink_unicast+0x191/0x230
Aug 26 10:26:32.696492 amc-k8sdevsl01-worker-lx13 kernel:  netlink_sendmsg+0x243/0x480
Aug 26 10:26:32.696513 amc-k8sdevsl01-worker-lx13 kernel:  sock_sendmsg+0x5e/0x60
Aug 26 10:26:32.696529 amc-k8sdevsl01-worker-lx13 kernel:  ____sys_sendmsg+0x1f3/0x260
Aug 26 10:26:32.697288 amc-k8sdevsl01-worker-lx13 kernel:  ? copy_msghdr_from_user+0x5c/0x90
Aug 26 10:26:32.697309 amc-k8sdevsl01-worker-lx13 kernel:  ? _cond_resched+0x15/0x30
Aug 26 10:26:32.697329 amc-k8sdevsl01-worker-lx13 kernel:  ___sys_sendmsg+0x81/0xc0
Aug 26 10:26:32.697348 amc-k8sdevsl01-worker-lx13 kernel:  ? do_lock_file_wait+0x6e/0xe0
Aug 26 10:26:32.697370 amc-k8sdevsl01-worker-lx13 kernel:  ? _cond_resched+0x15/0x30
Aug 26 10:26:32.698946 amc-k8sdevsl01-worker-lx13 kernel:  ? fcntl_setlk+0x1a5/0x2d0
Aug 26 10:26:32.698988 amc-k8sdevsl01-worker-lx13 kernel:  __sys_sendmsg+0x59/0xa0
Aug 26 10:26:32.699005 amc-k8sdevsl01-worker-lx13 kernel:  do_syscall_64+0x33/0x40
Aug 26 10:26:32.699020 amc-k8sdevsl01-worker-lx13 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Aug 26 10:26:32.699039 amc-k8sdevsl01-worker-lx13 kernel: RIP: 0033:0x7ff7ab1283ad
Aug 26 10:26:32.699071 amc-k8sdevsl01-worker-lx13 kernel: Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2>
Aug 26 10:26:32.699090 amc-k8sdevsl01-worker-lx13 kernel: RSP: 002b:00007ff7a232f9f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
Aug 26 10:26:32.699505 amc-k8sdevsl01-worker-lx13 kernel: RAX: ffffffffffffffda RBX: 00007ff7a2331b38 RCX: 00007ff7ab1283ad
Aug 26 10:26:32.699534 amc-k8sdevsl01-worker-lx13 kernel: RDX: 0000000000000000 RSI: 00007ff7a232fa48 RDI: 0000000000000078
Aug 26 10:26:09.088408 amc-k8sdevsl01-worker-lx13 kernel: SELinux:  Class xdp_socket not defined in policy.

Is there a fix available for this crash?

Thank you,
Yuri
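
For readers triaging the trace above: the frames that matter are the ones not prefixed with "?", since the kernel marks frames it could not verify that way. A minimal sketch of pulling those frames out of journal-style lines follows. The regular expression and the helper name reliable_frames are assumptions made for illustration, not part of any kernel tooling, and the sample lines are copied from the trace in this message.

```python
import re

# Matches "kernel:  [? ]symbol+0xOFF/0xSIZE [module]" at the end of a journal line.
# The pattern is an illustrative assumption, not an official log format spec.
FRAME = re.compile(
    r"kernel:\s+(?P<unreliable>\?\s+)?(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def reliable_frames(lines):
    """Return (symbol, module) pairs for frames not marked with '?'."""
    frames = []
    for line in lines:
        match = FRAME.search(line)
        # Skip non-frame lines, raw addresses, and '?' (unverified) frames.
        if match and not match.group("unreliable"):
            frames.append((match.group("symbol"), match.group("module")))
    return frames


# Lines copied verbatim from the trace in this message.
SAMPLE = """\
Aug 26 10:26:32.694351 amc-k8sdevsl01-worker-lx13 kernel: Call Trace:
Aug 26 10:26:32.694370 amc-k8sdevsl01-worker-lx13 kernel:  nf_queue_entry_release_refs+0x82/0xa0
Aug 26 10:26:32.695381 amc-k8sdevsl01-worker-lx13 kernel:  nf_reinject+0x6f/0x1a0
Aug 26 10:26:32.695404 amc-k8sdevsl01-worker-lx13 kernel:  0xffffffffc0857980
Aug 26 10:26:32.695425 amc-k8sdevsl01-worker-lx13 kernel:  nfnetlink_unicast+0x1f1/0x420 [nfnetlink]
Aug 26 10:26:32.695441 amc-k8sdevsl01-worker-lx13 kernel:  ? cred_has_capability+0x7f/0x120
""".splitlines()

for symbol, module in reliable_frames(SAMPLE):
    print(symbol, f"[{module}]" if module else "")
```

On the sample this leaves nf_queue_entry_release_refs, nf_reinject and nfnetlink_unicast, which places the refcount warning on the nfqueue verdict/reinject path.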


> On Dec 3, 2020, at 12:00 PM, Yuri Lipnesh <yuri.lipnesh@xxxxxxxxx> wrote:
> 
> It seems that upgrading to Linux 5.7 solved the problem; we will run more tests.
> Thank you,
> Yuri 
> 
>> On Nov 30, 2020, at 2:58 PM, Florian Westphal <fw@xxxxxxxxx> wrote:
>> 
>> Yuri Lipnesh <yuri.lipnesh@xxxxxxxxx> wrote:
>>> Linux system crashed
>>> 
>>> [    0.000000] Linux version 5.4.0-54-generic (buildd@lcy01-amd64-008) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #60~18.04.1-Ubuntu SMP Fri Nov 6 17:25:16 UTC 2020 (Ubuntu 5.4.0-54.60~18.04.1-generic 5.4.65)
>>> [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-54-generic root=UUID=11885fd3-b840-4c9b-a500-532c73ac952a ro find_preseed=/preseed.cfg auto noprompt priority=critical locale=en_US quiet crashkernel=512M-:192M
>>> 
>>> …
>>> [  156.321147] TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised.
>>> [  177.519159] general protection fault: 0000 [#1] SMP PTI
>>> [  177.519737] CPU: 5 PID: 18484 Comm: worker-1 Kdump: loaded Not tainted 5.4.0-54-generic #60~18.04.1-Ubuntu
>>> [  177.519742] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020
>>> [  177.519814] RIP: 0010:dev_hard_start_xmit+0x38/0x200
>>> [  177.519827] Code: 55 41 54 53 48 83 ec 20 48 85 ff 48 89 55 c8 48 89 4d b8 0f 84 c1 01 00 00 48 8d 86 90 00 00 00 48 89 fb 49 89 f4 48 89 45 c0 <4c> 8b 2b 48 c7 c0 d0 f2 04 8f 48 c7 03 00 00 00 00 48 8b 00 4d 85
>>> [  177.519829] RSP: 0018:ffffbc6d0609b5e8 EFLAGS: 00010286
>>> [  177.519833] RAX: 0000000000000000 RBX: dead000000000100 RCX: ffff95cf4bcfe800
>>> [  177.519835] RDX: 0000000000000000 RSI: ffff95cf4bcfe800 RDI: 0000000000000286
>>> [  177.519837] RBP: ffffbc6d0609b630 R08: ffff95cf6a190ec8 R09: ffff95cf4a2f7438
>>> [  177.519839] R10: ffffbc6d0609b6d0 R11: ffff95cf49d4d180 R12: ffff95cf51a5f000
>>> [  177.519841] R13: dead000000000100 R14: 000000000000009c R15: ffff95d02996b400
>>> [  177.519844] FS:  00007ff394cdfb20(0000) GS:ffff95d035d40000(0000) knlGS:0000000000000000
>>> [  177.519846] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  177.519848] CR2: 00007fb4a9c2d000 CR3: 00000001049fa004 CR4: 00000000003606e0
>>> [  177.519908] Call Trace:
>>> [  177.519917]  __dev_queue_xmit+0x719/0x920
>>> [  177.519930]  ? ctnetlink_conntrack_event+0x8c/0x5e0 [nf_conntrack_netlink]
>> 
>> Can you reproduce this on 5.7 or later, or with the following patches
>> backported to 5.4.y?
>> 
>> dd3cc111f2e3220ddc9c4ab17f13dc97759b5163
>> 119e52e664c57d5f7c0174dc2b3a296b1e40591d
>> af370ab36fcd19f04e3408c402608e7e56e6f188
>> 28f715b9e6dd7cbf07c2aea913fea7c87a56a3b5
>> 
>> The series fixed nfqueue reference counting.
> 
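One way to check whether a given kernel tree already carries the four commits listed above is sketched below. It assumes a local git checkout of the kernel source and relies on the stable-tree convention that backports cite the upstream commit ID in their commit message. The function names has_fix and report are hypothetical; only standard git subcommands (merge-base --is-ancestor, log --grep) are used.

```python
import subprocess

# Upstream commit IDs quoted from the thread above.
NFQUEUE_FIXES = [
    "dd3cc111f2e3220ddc9c4ab17f13dc97759b5163",
    "119e52e664c57d5f7c0174dc2b3a296b1e40591d",
    "af370ab36fcd19f04e3408c402608e7e56e6f188",
    "28f715b9e6dd7cbf07c2aea913fea7c87a56a3b5",
]


def has_fix(repo, upstream_id, rev="HEAD"):
    """Best-effort check that `rev` in `repo` contains `upstream_id`.

    True if the commit itself is an ancestor of `rev`, or if some commit
    reachable from `rev` mentions the ID in its message (the usual form
    of a stable backport). This is a heuristic, not a guarantee.
    """
    # Case 1: the upstream commit is directly reachable (mainline-based tree).
    direct = subprocess.run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", upstream_id, rev],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if direct.returncode == 0:
        return True
    # Case 2: a backport whose message cites the upstream commit ID.
    grep = subprocess.run(
        ["git", "-C", repo, "log", "-1", "--format=%H",
         f"--grep={upstream_id}", rev],
        capture_output=True,
        text=True,
    )
    return grep.returncode == 0 and grep.stdout.strip() != ""


def report(repo, rev="HEAD"):
    """Print one line per fix stating whether it appears to be present."""
    for commit in NFQUEUE_FIXES:
        state = "present" if has_fix(repo, commit, rev) else "missing"
        print(commit, state)
```

Calling report() with the path to a kernel checkout prints one line per commit. A "missing" result only means that neither check matched, so a vendor tree that rewrote the commit messages would need a manual look.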




