[PATCH] Nvidia MCP78S OHCI USB 10de:077b,d any usb device stop responding after short time - partial solution, help needed

Hello,

I have an Nvidia MCP78S mainboard (GeForce 8200). There is a serious problem with both OHCI USB 1.1 controllers, 10de:077b and 10de:077d.
All full-speed devices connected to them suddenly stop responding after a short, random time.
(Low-speed devices like keyboard and mouse seem to work, although moving the mouse extremely fast will hang USB.)
On Windows XP there is no problem - it always works. OpenSolaris also seems to be OK, but I did not test it much.
I would like to fix this, but I need some inspiration and help from experienced Linux USB or interrupt handler developers.

First I noticed that the noapic or acpi=noirq kernel parameters give a big improvement - USB almost never hangs.
I have checked the ACPI tables and they are clean, so this is not an ACPI bug or trap.
In this case noapic and acpi=noirq do the same thing: both disable the APIC interrupt controller, use the XT PIC instead, and limit my 4-core Phenom to a single core.
Knowing that replacing the APIC with the PIC increases USB stability from a few minutes to a few hours, I realized that something must be
wrong with interrupts on this MCP78S chipset. I took the 2.6.33.4 kernel sources and started to play with the APIC code.
Looking at /proc/interrupts I see that Linux uses the fasteoi interrupt handler. Windows XP and OpenSolaris use level.
So I patched Linux to use the level handler instead of fasteoi. Bingo.
Now USB stability has increased from a few minutes to a few hours, and I am no longer limited to one core.
The solution is not perfect. USB will still hang after a few hours, sometimes earlier or later depending on how much stress is put on the OHCI controllers.
On Windows XP I connected and used 8 USB 1.1 full-speed devices at the same time and it is rock solid. It never hangs.
Linux will still hang with only one, but after hours, not minutes. So there is room for improvement.
There is NO special USB driver from Nvidia on Windows - it uses the reference OHCI driver from the Windows install CD.
All other mainboard devices, including the EHCI (USB 2.0) controller, work perfectly. Only OHCI is broken.
More info here: https://bugzilla.kernel.org/show_bug.cgi?id=13405

Questions to developers/users:
1. Can you confirm that the OHCI USB driver on Linux 2.6 is multicore/SMP safe? (I would like to know if OHCI is rock solid on other chipsets for AMD Phenom, for example AMD SB600/700/800.)
2. I have currently run out of ideas about what can be wrong with this chipset's interrupts. If you know something that may improve or fix this chipset bug, let me know. I would be happy to test patches or write patches based on ideas sent by anyone.
3. I am also interested in differences in interrupt or OHCI USB handling between Linux and Windows XP. Restoring the level IRQ handler to recreate the Windows behaviour was the right thing to do. If there are other differences, implementing them may improve USB stability in this case.
4. The difference between the fasteoi and level handlers is whether the interrupt is masked or only an EOI is sent. Knowing that interrupts are out of band, maybe the sync between memory writes and interrupts is broken? The level handler is slower than fasteoi, and the XT PIC is slower than both. Maybe the OHCI controller is too slow for these fast handlers and gets out of sync, causing USB device timeouts.
5. Last thing that comes to my mind: maybe the OHCI controller forgets (or is lazy) to update the interrupt registers: IMR, IRR, ISR.


A few words about the included patch - it adds two new kernel boot parameters:
nofasteoiapic - uses the level handler instead of fasteoi for all level-triggered IO-APIC interrupts - makes USB 1.1 usable on Nvidia MCP78S chipset mainboards (I use it now)
antifasteoiapic= - replaces fasteoi with the level handler for the given interrupt numbers (it does not work - I do not know yet why)

TODO:
- add autodetection of 10de:077b, 10de:077d and automatically apply the level handler patch
This is hard to do so early, when the PCI access functions are not yet initialized by the kernel.
ioapic_register_intr, which I patch, does not know anything about the device PCI ID - only the interrupt number. :(

have a nice day,
Zbigniew Luszpinski
--- linux-2.6.33.3/arch/x86/kernel/apic/io_apic.c	2010-05-01 22:21:21.000000000 +0200
+++ linux-2.6.33.3/arch/x86/kernel/apic/io_apic.c	2010-05-01 22:40:19.000000000 +0200
@@ -73,6 +73,10 @@
  */
 int sis_apic_bug = -1;
 
+bool noFastEoiHandler = 0;
+#define MAX_LVL_IRQS_NR 24
+int irq_lvl_required[MAX_LVL_IRQS_NR];
+
 static DEFINE_SPINLOCK(ioapic_lock);
 static DEFINE_SPINLOCK(vector_lock);
 
@@ -124,6 +128,27 @@
 }
 early_param("noapic", parse_noapic);
 
+static int __init parse_NoFastEoiApic(char *str)
+{
+	/* replace the default fasteoi interrupt handler with level one */
+	noFastEoiHandler = 1;
+	return 0;
+}
+early_param("nofasteoiapic", parse_NoFastEoiApic);
+
+static int __init parse_NoFastEoiApicAt(char *str)
+{
+	/* Reset level int table to default -1 */
+	int i;
+	for(i = 0; i < MAX_LVL_IRQS_NR; i++) irq_lvl_required[i] = -1;
+	/* force level handler for listed irqs; get_options() takes the string itself and puts the count in slot 0 */
+	get_options(str, MAX_LVL_IRQS_NR, irq_lvl_required);
+	for(i = 0; i < MAX_LVL_IRQS_NR; i++) apic_printk(APIC_VERBOSE, KERN_INFO
+                        "Interrupt table: position %d value %d\n", i, irq_lvl_required[i]);
+	return 0;
+}
+early_param("antifasteoiapic", parse_NoFastEoiApicAt);
+
 struct irq_pin_list {
 	int apic, pin;
 	struct irq_pin_list *next;
@@ -1324,9 +1349,21 @@
 }
 #endif
 
-static void ioapic_register_intr(int irq, struct irq_desc *desc, unsigned long trigger)
+int CheckLevelNeeded(int irq)
 {
+/* Returns irq if it is on the level-handler list, 0 otherwise */
+	int i, result = 0;
+	/* get_options() stores the entry count in slot 0, values follow */
+	for(i = 1; i <= irq_lvl_required[0] && i < MAX_LVL_IRQS_NR; i++)
+	{
+		if(irq_lvl_required[i] == irq) result = irq;
+	}
+	apic_printk(APIC_VERBOSE, KERN_INFO "Interrupt found %d.\n", result);
+	return result;
+}
 
+static void ioapic_register_intr(int irq, struct irq_desc *desc, unsigned long trigger)
+{
 	if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) ||
 	    trigger == IOAPIC_LEVEL)
 		desc->status |= IRQ_LEVEL;
@@ -1346,10 +1383,21 @@
 	}
 
 	if ((trigger == IOAPIC_AUTO && IO_APIC_irq_trigger(irq)) ||
-	    trigger == IOAPIC_LEVEL)
+	    trigger == IOAPIC_LEVEL) {
+	   if (noFastEoiHandler)
+		set_irq_chip_and_handler_name(irq, &ioapic_chip,
+					      handle_level_irq,
+					      "level");
+	   else if (CheckLevelNeeded(irq)) {
+		set_irq_chip_and_handler_name(irq, &ioapic_chip,
+					      handle_level_irq,
+					      "level");
+			}
+	   else
 		set_irq_chip_and_handler_name(irq, &ioapic_chip,
 					      handle_fasteoi_irq,
 					      "fasteoi");
+		}
 	else
 		set_irq_chip_and_handler_name(irq, &ioapic_chip,
 					      handle_edge_irq, "edge");
