2009/4/8 Данила Жукоцкий <optimusgd@xxxxxxxxx>:
> Thank you for the answer, Grant. I will try this patch tomorrow.
>
> What I did before you wrote me the answer: I built all external stuff
> that is not critical to boot (sata, pata, usb, firewire, network drivers)
> as modules. All that stuff, except forcedeth, uses 32bit dma AFAIC.
> Rebooted in single user mode with an extremely low gart aperture,
> iommu=4194304 (4mb). I presume that so small an aperture helps me quickly
> trigger the bug. Unloaded all modules except ide\sata (ahci, libata,
> sata_nv) stuff and performed dd in and out of the sata hdd. I can read
> the whole hdd to /dev/null without problem, but while i perform dd from
> the raw raid volume to the sata hdd the computer hangs hard immediately
> with "kernel panic - not syncing: dma_map_area overflow 65536 bytes".

Please re-read what I wrote about sizing the IOMMU mapping resources.
In other words, with IOMMU=4MB, I expect this system to panic very quickly.

We've had similar problems in the parisc port in the past:
    http://lists.parisc-linux.org/hypermail/parisc-linux/9389.html

because the default IOMMU resource size was based on physical memory.
Systems with 64MB of RAM sized the space too small for basic use.

> No oops, no register dump, nothing, just this string in console.

This sounds like a different, secondary problem.

thanks,
grant

> Capslock and scroll lock
> leds blinking, magic sysrq does not work. I rebooted, removed all modules
> and all sata stuff, but inserted usb stuff and performed some tests on a
> usb drive without any problem. After that i performed a test without any
> modules loaded to and from /dev/null and a 3ware raid test partition. No
> problem. So it may be a sata\ide bug? Anyway, i can reproduce the hang
> with 1gb aperture size, but it takes a couple of hours.
>
> And again.
> Very strange behavior of forcedeth nics. When i correctly poweroff the
> 2.6.29 kernel and boot 2.6.28, all my forcedeth nics are down! In dmesg
> "no link during initialization" without "link up", leds off.
> Remove\insert modules, unplug and plug cables, warm reboots, cold
> reboots, poweroff with power cord off, resetting BIOS!!!!! won't help.
> The only way i can boot 2.6.28 with working net is to boot 2.6.29 (nic
> up, leds light), then hard poweroff by pressing the power button 5 sec,
> and then power up the computer. Nic leds still light during this
> barbarian procedure, and the 2.6.28 kernel boots with working network.
> After that reboot, poweron\poweroff and so on work as in all the years
> before 2.6.29 was released :) And i'm not alone with that, i know 3
> people with a similar forcedeth problem. One wrote this message
> http://forums.gentoo.org/viewtopic-p-5594777.html#5594777 , one this
> http://lkml.org/lkml/2009/1/11/403 (and returned the motherboard to the
> dealer, as i understand), and one is crying to me on icq. I can write a
> bugreport about this, but if you can explain this mystery to me it will
> be perfect, because debugging 2.6.29 with these glitches is pain in the
> ass.
>
>>2009/4/8 Grant Grundler <grundler@xxxxxxxxxx>:
>> 2009/4/7 Данила Жукоцкий <optimusgd@xxxxxxxxx>:
>>> Dammit BUG is still here!
>>>
>>> Sorry guys, bug is STILL here, but it is hard to trigger. After eight
>>> night hours of network and 3ware raid activity (sata drive unmounted,
>>> usb hdd unplugged) my dmesg is full of "dma_map_sg overflow" messages.
>>> Computer worked all night without any iommu options in boot string, but
>>> DMA is still leaking, as i see.
>>
>> Possibly, but not necessarily a leak though. As you've noticed, the IOMMU
>> mappings are a limited resource and we can run out if too many IOs are
>> in flight. I *think* a 64MB aperture means we can have 64MB of IO
>> mapped at the same time - assuming we have 100% efficiency and
>> no fragmentation or other forms of "unusable" space.
>>
>>> Anybody who understands how the IOMMU driver
>>> works, can you tell me how i can set a very small aperture size, about
>>> 4mb or something,
>>
>> You can't reduce the IOMMU mapping space to smaller than the
>> amount of IO that can possibly be in flight. A starting point to compute
>> this:
>> 1) count the number of block devices, the max size of each IO, and the
>>    queue depth.
>> 2) plus number of NICs * RX depth * TX depth * 4k (page size)
>>
>> I'm ignoring pci_consistent (ie control data) allocations. Those can be
>> determined by looking at an idle system after reboot. Don't know where but
>> I had added debug code to parisc IOMMU code to dump that sort of info in
>> the past.
>>
>> I'm also ignoring 64-bit capable devices completely bypassing the IOMMU
>> and not consuming any resources at all.
>>
>>> because with default 64mb aperture i get a leak only after
>>> a couple of hours, so the leak is small.
>>
>> For a system with a RAID device and more than 1GB of RAM, it's not that
>> hard to get more than 64MB of IO in flight. Just depends on the IO
>> controllers.
>>
>> If it's a leak, the only way to track it is to add debug code that tracks
>> the IP of the caller along with each mapping. Enabling such debugging
>> code will substantially change the timing of DMA map/unmap calls.
>>
>>> And again i do not understand the source
>>> of the leak, it may be network, sata, usb, etc (i didn't reboot the
>>> computer after yesterday's IO tests to and from usb\sata). So how can i
>>> painlessly catch the leak source without disabling subsystems one by
>>> one and hours-long tests? It is a workstation, and i must do my job
>>> all day with it.
>>
>> First off, you can eliminate 64-bit devices/drivers. Yinghai Lu submitted
>> a patch to print the DMA mask:
>>     http://lkml.org/lkml/2008/10/8/274
>>
>> Adding that patch, rebuilding the kernel and posting the dmesg output
>> would be a starting point. Posting "lspci -vt" and "lspci -v" would also
>> help.
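
The sizing rule quoted above (block devices, max IO size and queue depth, plus a per-NIC term in 4k pages) can be turned into a rough back-of-the-envelope estimate. The following is only a sketch under stated assumptions: the class names, the function name and all device numbers are hypothetical, and the NIC term is taken as the sum of RX and TX ring depths (one page-sized buffer per ring entry), which is my reading rather than the literal "RX depth * TX depth" wording. Like the quoted text, it ignores consistent (control data) allocations and 64-bit devices that bypass the IOMMU.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # 4k page, as in the sizing rule quoted above


@dataclass
class BlockDev:            # hypothetical helper type, for illustration only
    max_io_bytes: int      # largest single IO the device will issue
    queue_depth: int       # number of IOs that can be outstanding at once


@dataclass
class Nic:                 # hypothetical helper type, for illustration only
    rx_depth: int          # RX ring entries
    tx_depth: int          # TX ring entries


def estimate_inflight_bytes(block_devs, nics, page_size=PAGE_SIZE):
    """Rough upper bound on simultaneously mapped streaming DMA, in bytes."""
    block = sum(d.max_io_bytes * d.queue_depth for d in block_devs)
    # Assumption: one page-sized buffer per ring entry, so RX and TX add up.
    net = sum((n.rx_depth + n.tx_depth) * page_size for n in nics)
    return block + net


# Made-up numbers; real values come from the drivers and controllers in use.
devs = [BlockDev(max_io_bytes=512 * 1024, queue_depth=32),
        BlockDev(max_io_bytes=128 * 1024, queue_depth=31)]
nics = [Nic(rx_depth=128, tx_depth=256), Nic(rx_depth=128, tx_depth=256)]
total = estimate_inflight_bytes(devs, nics)
print(f"worst-case in-flight DMA: {total / 2**20:.1f} MiB")
```

If the estimate is close to or above the aperture size, "Out of IOMMU space" messages under load do not by themselves prove a leak.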
>>
>> FYI: PCI drivers get 32-bit dma_mask by default until the driver calls
>> pci_set_dma_mask(). So inspecting drivers will tell you if they are
>> actually using the IOMMU (or not if they set dma_mask to 64-bit).
>>
>> hth,
>> grant
>>
>>> 2009/4/8 Grant Grundler <grundler@xxxxxxxxxx>:
>>>> Anyone understand why allowdac exists or if it's buggy?
>>>>
>>>> thanks,
>>>> grant
>>>>
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: Grant Grundler <grundler@xxxxxxxxxx>
>>>> Date: 2009/4/7
>>>> Subject: Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
>>>> To: Данила Жукоцкий <optimusgd@xxxxxxxxx>
>>>> Cc: akpm@xxxxxxxxxxxxxxxxxxxx, linux-ide@xxxxxxxxxxxxxxx,
>>>> bugme-daemon@xxxxxxxxxxxxxxxxxxx, x86@xxxxxxxxxx
>>>>
>>>>
>>>> 2009/4/7 Данила Жукоцкий <optimusgd@xxxxxxxxx>
>>>>>
>>>>> Thank You, Grant, for Your simple questions!
>>>>
>>>> Welcome!
>>>> I was just trying to point out some additional information to folks
>>>> who might understand the code. I don't in this case but was looking
>>>> for more clues.
>>>>
>>>>>
>>>>> Without "allowdac", after a
>>>>> couple of hours of testing i cannot reproduce the bug! So it is my
>>>>> stupid mistake, i don't understand why i added this absolutely unusual
>>>>> parameter to the boot string.
>>>>
>>>> I thought I understood IOMMUs having written code for 4 different
>>>> implementations.
>>>> I don't understand why the "allowdac" parameter exists.
>>>> dma_mask stuff should be handling this already.
>>>> Can someone explain *why* (Данила already posted the docs) this
>>>> parameter exists?
>>>> (ie a use case where the dma_mask APIs don't work.)
>>>>
>>>>> I apologize to all of You for this stupid
>>>>> mindfuck. You may close the bug, because it is a bug in my damn head.
>>>>
>>>> My first impression: the bug is either that allowdac exists instead of
>>>> using the DMA API (See Documentation/DMA-API.txt) OR that the
>>>> documentation for allowdac is missing warnings about "this could break
>>>> your system" and does not clearly specify when it should be used.
>>>>
>>>> hth,
>>>> grant
>>>>
>>>>>
>>>>> 2009/4/7 Данила Жукоцкий <optimusgd@xxxxxxxxx>:
>>>>> > Forgot reply to all, sorry
>>>>> >
>>>>> > ---------- Forwarded message ----------
>>>>> > From: Данила Жукоцкий <optimusgd@xxxxxxxxx>
>>>>> > Date: 2009/4/7
>>>>> > Subject: Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
>>>>> > To: Grant Grundler <grundler@xxxxxxxxxx>
>>>>> >
>>>>> >
>>>>> > Yes, the attachment is a clean dmesg, warnings are in the bugreport body
>>>>> >
>>>>> >>Apr 3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
>>>>> >>Apr 3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
>>>>> >
>>>>> > I got "PCI-DMA: Out of IOMMU space" while trying to write data to usb
>>>>> > or sata hdd, before all other error messages. After that the usb and
>>>>> > sata drives were lost. All other noise is attempts to communicate with
>>>>> > the dead devices.
>>>>> >
>>>>> >>Kernel command line:
>>>>> >>mce=bootlog root=/dev/ram0 real_root=/dev/evms/root init=/linuxrc
>>>>> > I boot from 3ware raid with evms
>>>>> >>iommu=allowdac,merge,memaper=3
>>>>> > This is from Documentation/x86/x86_64/boot-options.txt
>>>>> > iommu=allowdac I was trying to avoid a DMA bug. Maybe that is not needed.
>>>>> >   allowdac   Allow double-address cycle (DAC) mode, i.e. DMA >4GB.
>>>>> >              DAC is used with 32-bit PCI to push a 64-bit address in
>>>>> >              two cycles. When off all DMA over >4GB is forced through
>>>>> >              an IOMMU or software bounce buffering.
>>>>> >   merge      Do scatter-gather (SG) merging.
>>>>> >   memaper[=<order>] Allocate an own aperture over RAM with size 32MB<<order.
>>>>> >              (default: order=1, i.e. 64MB)
>>>>> > With the default aperture, 64mb, DMA leaks very fast, now i have
>>>>> > memaper=5, 1 gb, because i must do my job and can't roll back to 2.6.28
>>>>> > due to a strange mysterious problem with forcedeth nics that i can't
>>>>> > explain and solve. If a solution for the DMA leak is not found, i will
>>>>> > try to file a bugreport about the problem with the nics.
>>>>> >
>>>>> >>3w_9xxx.use_msi=1 snd-hda-intel.enable_msi=1 doevms quiet
>>>>> > I prefer to use msi on that hardware.
>>>>> >>...
>>>>> >>Your BIOS doesn't leave a aperture memory hole
>>>>> >>Please enable the IOMMU option in the BIOS setup
>>>>> >>This costs you 256 MB of RAM
>>>>> >
>>>>> > The xw9400 BIOS does not have an IOMMU option in the BIOS setup. Now
>>>>> > this costs me 1gb of ram
>>>>> >
>>>>> > Anyway, i can reliably reproduce the bug without all these whistles
>>>>> >
>>>>> > 2009/4/7 Grant Grundler <grundler@xxxxxxxxxx>:
>>>>> >> 2009/4/5 Данила Жукоцкий <optimusgd@xxxxxxxxx>:
>>>>> >>> 2009/4/4 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>:
>>>>> >>>>
>>>>> >>>> (switched to email. Please respond via emailed reply-to-all, not via the
>>>>> >>>> bugzilla web interface).
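
The memaper description quoted above from boot-options.txt gives the aperture size as 32MB<<order. A minimal sketch of that arithmetic, which reproduces the figures mentioned in this thread (order 1 is 64MB, memaper=3 is the 256 MB the kernel warns about, memaper=5 is 1 GB). The function names are hypothetical, and any upper limit the kernel may place on the order is not modeled here.

```python
MB = 1 << 20


def memaper_bytes(order=1):
    """Aperture size for iommu=memaper=<order>, per the quoted doc: 32MB << order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (32 * MB) << order


def min_order_for(inflight_bytes):
    """Smallest order whose aperture covers inflight_bytes.

    Assumes 100% efficiency, i.e. no fragmentation or otherwise unusable space.
    """
    order = 0
    while memaper_bytes(order) < inflight_bytes:
        order += 1
    return order


for order in (1, 3, 5):
    print(f"memaper={order}: {memaper_bytes(order) // MB} MB")
```

Since the aperture is carved out of RAM on this machine, each step up in order doubles the memory given up, which is why the estimate of IO in flight matters before simply raising memaper.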
>>>>> >>>>
>>>>> >>>> On Fri, 3 Apr 2009 09:30:19 GMT bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
>>>>> >>>>
>>>>> >>>>> http://bugzilla.kernel.org/show_bug.cgi?id=13001
>>>>> >>>>>
>>>>> >>>>> Summary: PCI-DMA: Out of IOMMU space
>>>>> >>>>> Product: Platform Specific/Hardware
>>>>> >>>>> Version: 2.5
>>>>> >>>>> Kernel Version: 2.6.29-gentoo
>>>>> >>>>> Platform: All
>>>>> >>>>> OS/Version: Linux
>>>>> >>>>> Tree: Mainline
>>>>> >>>>> Status: NEW
>>>>> >>>>> Severity: normal
>>>>> >>>>> Priority: P1
>>>>> >>>>> Component: x86-64
>>>>> >>>>> AssignedTo: platform_x86_64@xxxxxxxxxxxxxxxxxxxx
>>>>> >>>>> ReportedBy: optimusgd@xxxxxxxxx
>>>>> >>>>> Regression: Yes
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> Created an attachment (id=20789)
>>>>> >>>>> --> (http://bugzilla.kernel.org/attachment.cgi?id=20789)
>>>>> >>>>> hwreport generated info
>>>>> >>>>>
>>>>> >>>>> After some IO activity the "PCI-DMA: Out of IOMMU space" message appear.
>>>>> >>>>> 2.6.28-gentoo-r4 work ok, so it is regression.
>>>>> >>>>
>>>>> >>>> It is indeed a regression.
>>>>> >>>>
>>>>> >>>>> Dmesg fragments:
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
>>>>> >>
>>>>> >> The bug report has a "dmesg" attachment but I wasn't able to find the
>>>>> >> "Out of IOMMU space" message in the dmesg. Can that be corrected?
>>>>> >> I was looking for IDE/SATA errors *before* the IOMMU errors.
>>>>> >>
>>>>> >> But I was surprised to find these bits:
>>>>> >> ...
>>>>> >> Kernel command line: mce=bootlog root=/dev/ram0
>>>>> >> real_root=/dev/evms/root init=/linuxrc iommu=allowdac,merge,memaper=3
>>>>> >> 3w_9xxx.use_msi=1 snd-hda-intel.enable_msi=1 doevms quiet
>>>>> >> Initializing CPU#0
>>>>> >> ...
>>>>> >> Your BIOS doesn't leave a aperture memory hole
>>>>> >> Please enable the IOMMU option in the BIOS setup
>>>>> >> This costs you 256 MB of RAM
>>>>> >> ...
>>>>> >>
>>>>> >> I'm not familiar with iommu= parameter nor the warning about the BIOS.
>>>>> >> Any comments on that?
>>>>> >>
>>>>> >> thanks,
>>>>> >> grant
>>>>> >>
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd ata1: SWNCQ:qc_active 0x0 defer_bits 0x0 last_issue_tag 0xfafbfcfd
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd dhfis 0x0 dmafis 0x0 sdbfis 0x0
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd ata1: ATA_REG 0x50 ERR_REG 0x0
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd ata1: tag : dhfis dmafis sdbfis sacitve
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd ata1.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd res 50/00:00:00:00:00/00:45:00:00:00/a0 Emask 0x40 (internal error)
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd ata1.00: status: { DRDY }
>>>>> >>>>> Apr 3 13:38:46 rngmhpamd ata1: hard resetting link
>>>>> >>>>
>>>>> >>>> Are these scary-looking messages also present in 2.6.28?
>>>>> >>>>
>>>>> >>>> If so, perhaps the ata code is leaking DMA memory on the error-handling path?
>>>>> >>>>
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd ata1.00: configured for UDMA/100
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd ata1: EH complete
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware sectors: (250 GB/232 GiB)
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware sectors: (250 GB/232 GiB)
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
>>>>> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>>>> >>>>>
>>>>> >>>>> And
>>>>> >>>>>
>>>>> >>>>> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 4608 bytes
>>>>> >>>>> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
>>>>> >>>>> Mar 31 20:56:48 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 8
>>>>> >>>>> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 11776 bytes
>>>>> >>>>> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
>>>>> >>>>> Mar 31 20:57:19 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 8
>>>>> >>>>> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 11776 bytes
>>>>> >>>>> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
>>>>> >>>>> Mar 31 20:57:50 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 8
>>>>> >>>>> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 11776 bytes
>>>>> >>>>> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
>>>>> >>>>> Mar 31 20:58:21 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 8
>>>>> >>>>> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 11776 bytes
>>>>> >>>>> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
>>>>> >>>>> Mar 31 20:58:52 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 8
>>>>> >>>>> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 11776 bytes
>>>>> >>>>> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
>>>>> >>>>> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Unhandled error code
>>>>> >>>>> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Result: hostbyte=0x07 driverbyte=0x00
>>>>> >>>>> Mar 31 20:59:01 rngmhpamd end_request: I/O error, dev sdc, sector 1137
>>>>> >>>>> Mar 31 20:59:01 rngmhpamd __ratelimit: 246 callbacks suppressed
>>>>> >>>>
>>>>> >>>> Do we have any debugging option for dumping the current PCI DMA
>>>>> >>>> allocations, find out where it has all gone?
>>>>> >>>>
>>>>> >>>>
>>>>> >>>
>>>>> >>> Upgraded to 2.6.29-gentoo-r1 (2.6.29.1), problem is still here, can
>>>>> >>> easily trigger it.
>>>>> >>> I boot with the default aperture, 64mb, and while
>>>>> >>> writing to usb-hdd get this:
>>>>> >>>
>>>>> >>> Apr 5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 65536 bytes
>>>>> >>> Apr 5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 65536 bytes
>>>>> >>> Apr 5 14:29:27 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 6
>>>>> >>> --
>>>>> >>> To unsubscribe from this list: send the line "unsubscribe linux-ide" in
>>>>> >>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>> >
>>>>> > --
>>>>> > Best regards, Данила Жукоцкий, system administrator, ЗАО "Роснефтегазмаш"
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html