> Hello Nebojša,
>
> I have a similar problem now with 3.2.51-rt72. Did you find any
> solution?
>
> regards, gerhard
>
> > -----Original Message-----
> > From: linux-rt-users-owner@xxxxxxxxxxxxxxx
> > [mailto:linux-rt-users-owner@xxxxxxxxxxxxxxx] On Behalf Of
> > Nebojša Cosic
> > Sent: Tuesday, 30 April 2013 19:27
> > To: Carsten Emde
> > Cc: linux-rt-users
> > Subject: Re: UDP jitter
> >
> > > Hi Nebojša,
> > Hi Carsten
> >
> > > > I am doing some work on a product running kernel 2.6.33.7.2-rt30.
> > > > Applications running on this kernel are a bit specific, meaning
> > > > that there are a number of threads running at different
> > > > priorities.
> > > > For several months I was haunted by spurious jitter detected on
> > > > UDP messages: multicast UDP messages were received on the
> > > > originating node without any delay, but on other nodes a delay in
> > > > the range of tens of milliseconds was detected. Simply put, it
> > > > looked as if a message was stuck in the kernel before finally
> > > > getting transmitted.
> > > > Finally, thanks to the LTTng tool, I was able to narrow the
> > > > problem down to this piece of code in net/sched/sch_generic.c:
> > > >
> > > > int sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
> > > >                     struct net_device *dev, struct netdev_queue *txq,
> > > >                     spinlock_t *root_lock)
> > > > {
> > > >         int ret = NETDEV_TX_BUSY;
> > > >
> > > >         /* And release qdisc */
> > > >         spin_unlock(root_lock);
> > > >
> > > >         HARD_TX_LOCK(dev, txq);
> > > >
> > > >         if (!netif_tx_queue_stopped(txq) && !netif_tx_queue_frozen(txq))
> > > >                 ret = dev_hard_start_xmit(skb, dev, txq);
> > > >
> > > >         HARD_TX_UNLOCK(dev, txq);
> > > >
> > > >         spin_lock(root_lock);
> > > >         ...
> > > >
> > > > When the transmit queue is empty, a thread wanting to send a
> > > > message comes directly to sch_direct_xmit, without changing
> > > > context. It then releases one spin lock and takes another. So far
> > > > so good.
> > > > If this starting thread is of lower priority, it can be preempted
> > > > by another thread while still inside the dev_hard_start_xmit
> > > > function. The preempting thread will check whether HARD_TX_LOCK
> > > > is taken and, if so, go on and queue its own message.
> > > > If there are enough higher-priority tasks, tx can be stalled
> > > > indefinitely.
> > > > [..]
> > > Did you increase the priority of the related sirq-net-tx and
> > > sirq-net-rx kernel threads appropriately? Some more details on
> > > enabling real-time Ethernet are given here ->
> > > https://www.osadl.org/?id=930.
> > Thanks for the link, I was aware of it.
> > I did try to increase the priority of sirq-net-tx and sirq-net-rx,
> > even setting tx higher than rx (in case incoming traffic was
> > creating problems), but it didn't make any difference.
> > I was trying to isolate the problem by running iperf, but it worked
> > perfectly well when run on its own. No wonder: it generates traffic
> > from a single priority, and to trigger this behaviour one needs
> > traffic from at least two priority levels and a busy CPU (so that a
> > low-priority thread can get blocked in the driver for a noticeable
> > period of time).
> > I suppose that running two iperf processes at different priorities
> > could demonstrate the problem.
> >
> > >
> > > -Carsten.
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe
> > > linux-rt-users" in the body of a message to
> > > majordomo@xxxxxxxxxxxxxxx
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> > --
> > Nebojša

You can try this patch. I am fairly sure that the same problem persists
on all newer kernels (I am using 2.6.33), but I never had time to write
a simple test to prove it.

Index: net/sched/sch_generic.c
===================================================================
--- net/sched/sch_generic.c	(revision 1709)
+++ net/sched/sch_generic.c	(revision 1710)
@@ -120,16 +120,18 @@
 	int ret = NETDEV_TX_BUSY;
 
 	/* And release qdisc */
-	spin_unlock(root_lock);
+/*	spin_unlock(root_lock);
 
 	HARD_TX_LOCK(dev, txq);
+*/
 	if (!netif_tx_queue_stopped(txq) && !netif_tx_queue_frozen(txq))
 		ret = dev_hard_start_xmit(skb, dev, txq);
 
+/*
 	HARD_TX_UNLOCK(dev, txq);
 
 	spin_lock(root_lock);
-
+*/
 	if (dev_xmit_complete(ret)) {
 		/* Driver sent out skb successfully or skb was consumed */
 		ret = qdisc_qlen(q);

Another way to work around the problem is to use a user-space daemon
(zeromq, for example) as a network scheduler, and to allow communication
only from that daemon (which can run at as high a priority as you need).

--
Nebojša
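
For anyone who wants to try the user-space workaround mentioned above
without pulling in zeromq, here is a rough sketch of the idea in plain C:
application threads only enqueue datagrams, and a single sender thread is
the only one that ever calls sendto(). All names (txq, txq_push,
txq_sender_main, the queue depth and the payload limit) are made up for
illustration and are not taken from any existing library. It is untested
against the jitter described in this thread, so treat it as a starting
point only.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

#define TXQ_SLOTS  64    /* hypothetical queue depth */
#define TXQ_MAXLEN 1472  /* hypothetical payload limit (one Ethernet frame) */

struct txq_msg {
	size_t len;
	unsigned char data[TXQ_MAXLEN];
};

struct txq {
	pthread_mutex_t lock;
	pthread_cond_t nonempty;
	size_t head, count;
	struct txq_msg slot[TXQ_SLOTS];
};

/* Hypothetical per-sender state: queue, socket and destination. */
struct txq_sender {
	struct txq *q;
	int fd;
	struct sockaddr_storage dst;
	socklen_t dstlen;
	unsigned long errors;
};

/* Initialise an empty queue. Returns 0 on success, -1 on failure. */
int txq_init(struct txq *q)
{
	q->head = q->count = 0;
	if (pthread_mutex_init(&q->lock, NULL) != 0)
		return -1;
	if (pthread_cond_init(&q->nonempty, NULL) != 0)
		return -1;
	return 0;
}

/* Called by application threads of any priority; never touches the socket.
 * Returns 0 on success, -1 if the message is too long or the queue is full. */
int txq_push(struct txq *q, const void *buf, size_t len)
{
	int ret = -1;

	if (len > TXQ_MAXLEN)
		return -1;
	pthread_mutex_lock(&q->lock);
	if (q->count < TXQ_SLOTS) {
		struct txq_msg *m = &q->slot[(q->head + q->count) % TXQ_SLOTS];

		memcpy(m->data, buf, len);
		m->len = len;
		q->count++;
		pthread_cond_signal(&q->nonempty);
		ret = 0;
	}
	pthread_mutex_unlock(&q->lock);
	return ret;
}

/* Blocks until a message is available, then copies the oldest one to out. */
void txq_pop(struct txq *q, struct txq_msg *out)
{
	pthread_mutex_lock(&q->lock);
	while (q->count == 0)
		pthread_cond_wait(&q->nonempty, &q->lock);
	assert(q->count > 0);
	*out = q->slot[q->head];
	q->head = (q->head + 1) % TXQ_SLOTS;
	q->count--;
	pthread_mutex_unlock(&q->lock);
}

/* Thread body for the single sender: the only caller of sendto(). */
void *txq_sender_main(void *arg)
{
	struct txq_sender *s = arg;
	struct txq_msg m;

	for (;;) {
		txq_pop(s->q, &m);
		if (sendto(s->fd, m.data, m.len, 0,
			   (struct sockaddr *)&s->dst, s->dstlen) < 0)
			s->errors++;	/* a real daemon would report this */
	}
	return NULL;
}
```

The sender would be started with pthread_create() and given a SCHED_FIFO
priority above every producer (for example via pthread_setschedparam()),
so that the thread inside the transmit path is never the low-priority
one. On an -rt system the queue mutex should probably also use priority
inheritance; that is left out here to keep the sketch short.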