Re: [PATCH 4.9.y] netfilter: nat: Revert "netfilter: nat: convert nat bysrc hash to rhashtable"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



the following patch based on the suggestion by Guillaume works good in my tests and does not crash anymore

--- nf_nat_core.c       (Revision 33766)
+++ nf_nat_core.c       (Arbeitskopie)
@@ -678,16 +678,11 @@
 /* No one using conntrack by the time this called. */
 static void nf_nat_cleanup_conntrack(struct nf_conn *ct)
 {
-       struct nf_conn_nat *nat = nf_ct_ext_find(ct, NF_CT_EXT_NAT);
-
-       if (!nat)
-               return;
-
-       NF_CT_ASSERT(ct->status & IPS_SRC_NAT_DONE);
-
-       spin_lock_bh(&nf_nat_lock);
-       hlist_del_rcu(&ct->nat_bysource);
-       spin_unlock_bh(&nf_nat_lock);
+       if (ct->status & IPS_SRC_NAT_DONE) {
+               spin_lock_bh(&nf_nat_lock);
+               hlist_del_rcu(&ct->nat_bysource);
+               spin_unlock_bh(&nf_nat_lock);
+       }
 }

 static struct nf_ct_ext_type nat_extend __read_mostly = {


Am 14.11.2017 um 23:16 schrieb Sebastian Gottschall:
Am 14.11.2017 um 20:30 schrieb Guillaume Nault:
On Tue, Nov 14, 2017 at 12:05:46PM +0100, Florian Westphal wrote:
Sebastian Gottschall <s.gottschall@xxxxxxxxxx> wrote:


this should belong to you. with the revert the kernel crashes for me

[   24.838120] Unable to handle kernel NULL pointer dereference at virtual
address 00000055
[   24.846193] pgd = c0004000
[   24.848893] [00000055] *pgd=00000000
[   24.852472] Internal error: Oops - BUG: 817 [#1] PREEMPT SMP ARM
[   24.858463] Modules linked in: xhci_plat_hcd xhci_pci xhci_hcd ohci_hcd
ehci_pci ehci_platform ehci_hcd usbcore usb_common nls_base qca_ssdk
gpio_pca953x mii_gpio wil6210 ath10k_pci ath10k_core ath9k ath9k_common
ath9k_hw ath mac80211 cfg80211 compat
[   24.880852] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.9.62-rc1 #90
[   24.887189] Hardware name: AnnapurnaLabs Alpine (Device Tree)
[   24.892921] task: ef029ac0 task.stack: ef05a000
[   24.897444] PC is at nf_nat_cleanup_conntrack+0x4c/0x74
[   24.902657] LR is at nf_nat_cleanup_conntrack+0x38/0x74
[   24.907869] pc : [<c04858c8>]    lr : [<c04858b4>]    psr: 60000153
[   24.907869] sp : ef05bb58  ip : ef05bb58  fp : ef05bb6c
[   24.919317] r10: ed230cc0  r9 : ed230c00  r8 : edf45800
[   24.924529] r7 : ebcd2f00  r6 : ec33739e  r5 : c0892294 r4 : ebcd2f00 [   24.931040] r3 : 00000000  r2 : 00000055  r1 : 00000000 r0 : c0892718 [   24.937551] Flags: nZCv  IRQs on  FIQs off  Mode SVC_32 ISA ARM  Segment
user
[   24.944755] Control: 10c5387d  Table: 2bd1006a  DAC: 00000055
[   24.950486] Process swapper/2 (pid: 0, stack limit = 0xef05a210)
[   24.956477] Stack: (0xef05bb58 to 0xef05c000)
  Guillaume, can you check what I missed?
IIRC you said you had a revert for your 4.9 tree, correct?

Actually I had forgotten some details when I told you about it. What I
did is simpler than a revert. I just cherry-picked the following
commits:
124dffea9e8e ("netfilter: nat: use atomic bit op to clear the _SRC_NAT_DONE_BIT")
6e699867f84c ("netfilter: nat: avoid use of nf_conn_nat extension")
e1bf1687740c ("netfilter: nat: Revert "netfilter: nat: convert nat bysrc hash to rhashtable"")

Then I just had to fix one remaining call:
@@ -842,7 +842,7 @@ static int __init nf_nat_init(void)
         return 0;
      cleanup_extend:
-       rhltable_destroy(&nf_nat_bysource_table);
+       nf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);
         nf_ct_extend_unregister(&nat_extend);
         return ret;
  }


My original approach was to revert commit
870190a9ec90 ("netfilter: nat: convert nat bysrc hash to rhashtable")
and fix the conflicts by hand. But I had crashes when testing the
result. After a bit of debugging, I decided to simply try cherry-picking
commit e1bf1687740c and its dependencies. Since I got better results
with this approach, I dropped my original revert code (so,
unfortunately, I can't compare with yours and I don't remember if the
crash was similar to Sebastian's).
We have the cherry-picked patches running on several machines since
September. No problem found so far.


Now, looking at Sebastian's report, I suspect that using
nf_ct_ext_find(NF_CT_EXT_NAT) in nf_nat_cleanup_conntrack() might not
be appropriate anymore. My approach uses the IPS_SRC_NAT_DONE bit
instead, which is now set atomically. Theses changes weren't
intentional, but introduced by the dependency on commits 124dffea9e8e
and 6e699867f84c.

Does that make sense to you? I can look more deeply into this tomorrow.

i love crashing my devices. let me test this approach


Sebastian




--
Mit freundlichen Grüssen / Regards

Sebastian Gottschall / CTO

NewMedia-NET GmbH - DD-WRT
Firmensitz:  Stubenwaldallee 21a, 64625 Bensheim
Registergericht: Amtsgericht Darmstadt, HRB 25473
Geschäftsführer: Peter Steinhäuser, Christian Scheele
http://www.dd-wrt.com
email: s.gottschall@xxxxxxxxxx
Tel.: +496251-582650 / Fax: +496251-5826565




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]