On Thu, Jul 26, 2012 at 11:27:47PM -0700, Sarah Sharp wrote:
> Yeah, I was kind of banging my head on the desk as well.  I'm really not
> sure why it wasn't causing a general protection fault, so it's possible
> I could be wrong in my analysis of what the bug might do. ;-)
> > I did test this latest patch, and things are still haywire with the same error
> > according to the fuse exfat, however there's no trace of "error" (case
> > insensitive) except for some messages which say this again and again from the
> > debug logic:
> >
> > Jul 25 01:36:08 themhallbox kernel: [ 601.092877] xhci_hcd 0000:03:00.0: HC error bitmask = 0x0
>
> That message is harmless, since there's no error bits set in the
> bitmask.  Were you using a case insensitive search?  Because
> the first part of the message was "ERROR".  If you were using a case
> insensitive search, that means there's no xHCI problems, at least.

I am pretty sure, but it never hurts to have a second set of eyes on the kern
log I posted in my second mail.  In the part I posted which covers the time
range in question, it looks like we get some more entertaining errors.

$ zfgrep -i error dmesg-usb-3-port-memory-patch-plugin.txt.gz
Jul 26 22:53:28 themhallbox kernel: [31254.692933] xhci_hcd 0000:03:00.0: HC error bitmask = 0x0
Jul 26 22:54:29 themhallbox kernel: [31314.680519] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 22:55:29 themhallbox kernel: [31374.668080] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 22:56:29 themhallbox kernel: [31434.655636] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 22:57:29 themhallbox kernel: [31494.643194] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 22:58:29 themhallbox kernel: [31554.630756] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 22:59:29 themhallbox kernel: [31614.618308] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:00:30 themhallbox kernel: [31674.605863] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:01:30 themhallbox kernel: [31734.593415] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:01:52 themhallbox kernel: [31756.388465] end_request: I/O error, dev sdi, sector 32768
Jul 26 23:01:52 themhallbox kernel: [31756.388469] Buffer I/O error on device sdi1, logical block 0
Jul 26 23:01:52 themhallbox kernel: [31756.388472] lost page write due to I/O error on sdi1
Jul 26 23:02:30 themhallbox kernel: [31794.580971] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:03:30 themhallbox kernel: [31854.568518] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:04:30 themhallbox kernel: [31914.556076] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:05:17 themhallbox kernel: [31960.910771] end_request: I/O error, dev sdi, sector 32768
Jul 26 23:05:17 themhallbox kernel: [31960.910775] Buffer I/O error on device sdi1, logical block 0
Jul 26 23:05:17 themhallbox kernel: [31960.910778] lost page write due to I/O error on sdi1
Jul 26 23:05:30 themhallbox kernel: [31974.543623] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:06:31 themhallbox kernel: [32034.531186] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:07:31 themhallbox kernel: [32094.518739] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:08:31 themhallbox kernel: [32154.506303] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:09:31 themhallbox kernel: [32214.493865] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4
Jul 26 23:10:31 themhallbox kernel: [32274.481423] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4

> > But we are definitely still getting some kind of reactor meltdown on some kind
> > of exfat write, probably the superblock update according to my prior code
> > inspection...
> >
> > Jul 26 23:01:52 themhallbox kernel: [31756.388465] end_request: I/O error, dev sdi, sector 32768
> > Jul 26 23:01:52 themhallbox kernel: [31756.388469] Buffer I/O error on device sdi1, logical block 0
> > Jul 26 23:01:52 themhallbox kernel: [31756.388472] lost page write due to I/O error on sdi1
>
> Man, I hope my code hasn't eaten your disk.  Is there any chance you
> could replace the drive in the enclosure and create a new file system to
> test?

This part is tricky, because I only have two of these SDXC memory cards, and I
haven't got a reliable way of formatting exfat back onto one right now to be
sure I get a clean run.  My only Windows box is a Windows XP VirtualBox VM,
because I've used Linux as my by-far primary OS since 2005 and main OS since
1996.  I will try to see if I can convince XP to put a new exfat FS on there
using one of the USB 2.0 ports and see how far I get.

> > Also last time I was having a hard time capturing this, but I snagged it this
> > time:
> >
> > mhall@themhallbox:~$ sudo mount /dev/sdi1 /mnt
> > FUSE exfat 0.9.7
> > ERROR: fsync failed.
>
> Well, let me see the dmesg with CONFIG_USB_DEBUG and
> CONFIG_USB_XHCI_HCD_DEBUGGING turned on, and I'll see if this is caused
> by an xHCI error, or a filesystem error.

I attached the forgotten kern log in the follow-on mail.

> Sarah Sharp

Regards,
Matthew.
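
For anyone sifting a similar kern log, here is a minimal sketch of how the grep output could be summarized instead of read line by line. It only assumes the two message shapes seen in this thread ("HC error bitmask = 0x..." and "end_request: I/O error, dev ..., sector ..."). The function names `summarize` and `set_bits` are made up for illustration, and the script deliberately does not try to say what any individual bit means, since that depends on the xhci driver source.

```python
import re
from collections import Counter

# Patterns match only the two message shapes quoted in this thread.
BITMASK_RE = re.compile(r"HC error bitmask = 0x([0-9a-fA-F]+)")
IO_ERROR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")

# A few lines copied from the log excerpt above.
SAMPLE = [
    "Jul 26 22:53:28 themhallbox kernel: [31254.692933] xhci_hcd 0000:03:00.0: HC error bitmask = 0x0",
    "Jul 26 22:54:29 themhallbox kernel: [31314.680519] xhci_hcd 0000:03:00.0: HC error bitmask = 0x4",
    "Jul 26 23:01:52 themhallbox kernel: [31756.388465] end_request: I/O error, dev sdi, sector 32768",
    "Jul 26 23:01:52 themhallbox kernel: [31756.388469] Buffer I/O error on device sdi1, logical block 0",
]


def set_bits(mask):
    """Return the positions of the bits set in mask (bit 0 = least significant)."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def summarize(lines):
    """Count each bitmask value and collect (device, sector) block I/O errors."""
    masks = Counter()
    io_errors = []
    for line in lines:
        match = BITMASK_RE.search(line)
        if match:
            masks[int(match.group(1), 16)] += 1
            continue
        match = IO_ERROR_RE.search(line)
        if match:
            io_errors.append((match.group(1), int(match.group(2))))
    return masks, io_errors


if __name__ == "__main__":
    masks, io_errors = summarize(SAMPLE)
    for mask, count in sorted(masks.items()):
        print(f"bitmask 0x{mask:x}: seen {count} time(s), bits set: {set_bits(mask)}")
    for dev, sector in io_errors:
        print(f"I/O error on {dev} at sector {sector}")
```

Feeding it the decompressed log (for example via `gzip.open(path, "rt")`) rather than `SAMPLE` would show at a glance whether the bitmask is ever non-zero and which sectors the failed writes hit.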