If you read comment above htb_dequeue_tree, it should be called only when it is sure that there are packets inside of the level/prio. It is known by other HTB mechanism (per-level activity lists).
Thus the bugtrap is to catch case where class was inserted into activity list because it had packets in its sub-qdisc but when we actually decide to dequeue - it has no packet. It is weird - can qdisc lose packets even when dequeue was not called ??
If you change the depth of the leave queue then it is possible to drop packets or if you completely exchange the queue. Which would also explain why the assertion only occurs when the configuration is altered.
Well, I agree that there is something wrong. Now it is neccessary to find scenario where it does happen so that it is fixed in right way. I have not much time these days to test these cases but your informations would lead to following hypothesis:
Classe's child qdisc is replaced while old one still has nonzero queue. New empty qdisc is grafted under class instead. What about attached patch (it is against my latest version so you can see offset warnings) ?
This would not work if there are several intermediates HTB queues from the device to the leave queue. In this case every queue from the queue that was changed to the root has to be notified about the change. (The setup we want to use involves such a configuration.) Maybe it is better to just deactivate a class when a dequeue from its leave failes due to a zero queue length. If you are concerned about performance then an audit process could be implemented. For example to check one leave queue every 64 packets +/- initial random offset to create some entropy similar to the maximum mount count in the ext2 filesystem. Maybe there are better ways to do this. I am not so familiar with the code.
I will make some tests with the patch tomorrow. If my theory is true then it should still help a lot.
bye, wilfried
devik
------------------------------------------------------------------------
--- sch_htb.c 2003/07/05 10:37:11 1.21
+++ sch_htb.c 2003/07/20 07:24:59
@@ -1286,6 +1286,10 @@ static int htb_graft(struct Qdisc *sch, return -ENOBUFS;
sch_tree_lock(sch);
if ((*old = xchg(&cl->un.leaf.q, new)) != NULL) {
+ /* TODO: test it */
+ if (cl->prio_activity)
+ htb_deactivate ((struct htb_sched*)sch->data,cl);
+
/* TODO: is it correct ? Why CBQ doesn't do it ? */
sch->q.qlen -= (*old)->q.qlen;
qdisc_reset(*old);