On Mon, 6 Feb 2012, Steven Rostedt wrote:
> On Mon, 2012-02-06 at 09:51 +0100, Hector Palacios wrote:
> > On 02/03/2012 06:39 PM, Steven Rostedt wrote:
> > > Note that you see that this causes a hang in the system if ksoftirqd is
> > > a real time task.
> >
> > This is true.
> >
> > > Not to mention, that ksoftirqd spins in an infinite
> > > loop if the cable isn't connected (regardless of ksoftirqd's priority).
> >
> > This is not true. The infinite loop is only hit when ksoftirqd is a real time task. I
> > think you got confused by the different patches we tried. That dirty hack of yours
> > with the workqueue was the one hanging with the cable disconnected. ;o)
> >
>
> I didn't say it was going to hang the box, I said it was going to spin.
>
> With the cable disconnected, did you run top to see if ksoftirqd was
> running at near 100%? It won't lock up the box because ksoftirqd is not
> a real time task in mainline.

NETDEV_TX_BUSY has always been a source of trouble and we carry a
bunch of patches in RT which handle the obvious candidates since we
encountered the first spinning lockup on RT. Mainline does not notice
as it falls back to the SCHED_OTHER softirq thread after trying to
reschedule the same thing over and over.

NETDEV_TX_BUSY simply should die. It's a bad design decision (invented
for mitigation of SMP lock contention problems) and its abuse by
driver writers to bridge the gap of hardware bringup is just a
consequence of that decision.

	if (!fep->link) {
		/* Link is down or autonegotiation is in progress. */
		return NETDEV_TX_BUSY;
	}

So instead of handling link down and autonegotiation gracefully, this
code relies on the fact that a 2 second spinning loop goes unnoticed
in mainline because ksoftirqd runs with SCHED_OTHER.

Oh well,

	tglx
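
To make the spinning visible, here is a toy user-space model in C, not
kernel code and not a patch. Everything in it (struct toy_dev,
toy_xmit(), toy_link_change(), run_one_packet()) is a hypothetical name
made up for illustration. It compares a driver that answers "busy" while
the link is down with one whose link-change handler stops the queue, so
the transmit hook is never entered until the link is back. In a real
driver the second variant would presumably be built on helpers such as
netif_stop_queue()/netif_wake_queue() or netif_carrier_off()/
netif_carrier_on(); that mapping is my assumption about what "gracefully"
could mean here, not something stated above.

```c
#include <stdbool.h>

/* Toy model only: all names below are hypothetical, not kernel API. */
enum toy_tx_result { TOY_TX_OK, TOY_TX_BUSY };

struct toy_dev {
	bool link_up;             /* plays the role of fep->link */
	bool queue_stopped;       /* models a stopped TX queue */
	unsigned long xmit_calls; /* how often the xmit hook was entered */
};

/* Transmit hook: reports "busy" while the link is down. */
static enum toy_tx_result toy_xmit(struct toy_dev *dev)
{
	dev->xmit_calls++;
	if (!dev->link_up)
		return TOY_TX_BUSY;
	return TOY_TX_OK;
}

/*
 * Link-change handler. With use_stop set, the queue state follows the
 * link state, so the caller never enters toy_xmit() while the link is down.
 */
static void toy_link_change(struct toy_dev *dev, bool up, bool use_stop)
{
	dev->link_up = up;
	if (use_stop)
		dev->queue_stopped = !up;
}

/*
 * Models the retry loop for a single packet. The link starts down and
 * comes up at round up_at. Returns how many times toy_xmit() was entered.
 */
static unsigned long run_one_packet(struct toy_dev *dev, bool use_stop,
				    unsigned long up_at, unsigned long rounds)
{
	bool sent = false;

	dev->xmit_calls = 0;
	toy_link_change(dev, false, use_stop);

	for (unsigned long t = 0; t < rounds && !sent; t++) {
		if (t == up_at)
			toy_link_change(dev, true, use_stop);
		if (dev->queue_stopped)
			continue; /* caller leaves the driver alone */
		sent = (toy_xmit(dev) == TOY_TX_OK);
	}
	return dev->xmit_calls;
}
```

With the link coming up at round 1000, run_one_packet(&d, false, 1000,
2000) enters the transmit hook 1001 times, while run_one_packet(&d, true,
1000, 2000) enters it once. The model says nothing about scheduling
classes; it only shows where the repeated calls come from.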