Re: [LARTC] 2.4.20 htb3 oops

Linux Advanced Routing and Traffic Control


Hi everyone,

I have been getting kernel oopses since I introduced HTB on my
company's PC-based routers. Only routers with high network load seem
to be affected. The average network load on the two most problematic
routers is 10Mbps in/out and 2.5Mbps in/out respectively.
The other machines, with less than 1Mbps average traffic, seem unaffected.
 
We have been getting oopses on these machines 1-3 times per week.

We have tried replacing the hardware on both machines without any
improvement. We are using the same combination of hardware and kernel in
the same physical location without any problems, so we assume that hardware,
kernel or heat is not the problem here.
Machines with high network load that do not have any HTB rules loaded
do not suffer from this problem.

Hardware info:
  Router 1 (10Mbps avg in/out):
    1 x Intel(R) Celeron(R) CPU 1.80GHz
    256MB RAM
    eth0: Intel Corp. 82801BD PRO/100 VE (CNR)
    eth1: RealTek RTL8139

  Router 2 (2.5Mbps avg in/out):
    1 x Intel(R) Celeron(R) CPU 1.70GHz
    128MB RAM
    eth0: RealTek RTL8139
    eth1: RealTek RTL8139

Both use Linux kernel 2.4.20 with patches for FreeS/WAN and connection
tracking of GRE/PPTP connections. They are both single-processor machines.
They both shape traffic from and to a VLAN interface. The kernel is compiled
for CPU type "Pentium-III/Celeron" but the machines are running on
Pentium-IV/Celeron processors, if that matters. Router 1 was using a P3 CPU
before we replaced the hardware, and we had the same problem back then.

Unfortunately I have not been able to gather any output from the consoles of
the crashed machines.

Here is the ruleset script:
#!/bin/sh
for DEV in eth0.123 eth1
do
        tc qdisc del dev $DEV root
        tc qdisc add dev $DEV root handle 1: htb
        # Total
        tc class add dev $DEV parent 1:0 classid 1:1 htb rate 12Mbit
        # Default class
        tc class add dev $DEV parent 1:1 classid 1:2 htb rate 11Mbit
        # Filesharing traffic
        tc class add dev $DEV parent 1:1 classid 1:3 htb rate 512Kbit
        # ICMP (Highest priority - on customer's request, not ours)
        tc class add dev $DEV parent 1:1 classid 1:4 htb rate 512Kbit \
prio 0
        tc qdisc add dev $DEV parent 1:2 handle 2: sfq
        tc qdisc add dev $DEV parent 1:3 handle 3: sfq
        tc qdisc add dev $DEV parent 1:4 handle 4: sfq
        for PORT in 411 412 413 4661 4662 8081 19114 6340 6341 6342 \
6343 6344 6345 6346 6347 6348 6349 1214 1215 6699 6257 7668
        do
                # Send to "crap-class"
                tc filter add dev $DEV protocol ip parent 1: prio 1 u32 \
match ip sport $PORT 0xffff flowid 1:3
                tc filter add dev $DEV protocol ip parent 1: prio 1 u32 \
match ip dport $PORT 0xffff flowid 1:3
        done
        tc filter add dev $DEV protocol ip parent 1: prio 1 u32 match ip \
protocol 1 0xff flowid 1:4 # ICMP
        tc filter add dev $DEV protocol ip parent 1: prio 2 u32 match ip \
protocol 0 0x00 flowid 1:2 # Everything else
done
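
As a quick sanity check on the rates in the script above, the three child
classes (1:2, 1:3, 1:4) can be added up and compared with the parent class
1:1. This is only a sketch: the function names are made up for this mail, and
I am not sure whether this tc version counts 1Mbit as 1000 or 1024 Kbit, so
the multiplier is passed in as a parameter. I do not know whether this has
anything to do with the oopses.

```shell
#!/bin/sh
# Sum of the child class rates from the ruleset (11Mbit + 512Kbit + 512Kbit),
# expressed in Kbit.
# $1 = how many Kbit make one Mbit (an assumption; pass 1000 or 1024).
child_sum_kbit() {
        echo $(( 11 * $1 + 512 + 512 ))
}

# Rate of the parent class 1:1 (12Mbit) in Kbit, for the same multiplier.
parent_kbit() {
        echo $(( 12 * $1 ))
}

# Print both figures for each possible interpretation of "Mbit".
for K in 1000 1024
do
        echo "1Mbit=${K}Kbit: children=$(child_sum_kbit $K) parent=$(parent_kbit $K)"
done
```

With a multiplier of 1024 the children add up exactly to the parent rate;
with 1000 they exceed it by 24 Kbit.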

I have not tried to apply the HTB patches from the latest prepatch
version of the Linux kernel or the "htb_3.7_delay_bug" patch
(I think they do the same thing?). Maybe I should try that?

I can get more information (like kernel config etc.) if anyone needs it,
but this thing is really hard to debug since it only happens sporadically.

Thanks,
Göran

>
> In my SMP system (2xp3) I had also oops (2.4.19 and 2.4.20), but
> on single processor systems everything is OK.
>

