2009/4/7 Данила Жукоцкий <optimusgd@xxxxxxxxx>
>
> Thank you, Grant, for your simple questions!

Welcome! I was just trying to point out some additional information to
folks who might understand the code. I don't in this case, but was
looking for more clues.

>
> Without "allowdac", after a couple of hours of testing, I cannot
> reproduce the bug! So it is my stupid mistake. I don't understand why
> I added this absolutely unusual parameter to the boot string.

I thought I understood IOMMUs, having written code for 4 different
implementations. I don't understand why the "allowdac" parameter exists.
The dma_mask stuff should be handling this already. Can someone explain
*why* (Данила already posted the docs) this parameter exists?
(i.e. a use case where the dma_mask APIs don't work.)

> I apologize to all of you for this stupid
> mindfuck. You may close the bug, because it is a bug in my damn head.

My first impression: the bug is either that allowdac exists instead of
using the DMA API (see Documentation/DMA-API.txt), OR that the
documentation for allowdac is missing warnings that "this could break
your system" and does not clearly specify when it should be used.

hth,
grant

>
> 2009/4/7 Данила Жукоцкий <optimusgd@xxxxxxxxx>:
> > Forgot to reply to all, sorry.
> >
> > ---------- Forwarded message ----------
> > From: Данила Жукоцкий <optimusgd@xxxxxxxxx>
> > Date: 2009/4/7
> > Subject: Re: [Bugme-new] [Bug 13001] New: PCI-DMA: Out of IOMMU space
> > To: Grant Grundler <grundler@xxxxxxxxxx>
> >
> >
> > Yes, the attachment has a clean dmesg; the warnings are in the bug report body.
> >
> >>Apr 3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
> >>Apr 3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
> >
> > I got "PCI-DMA: Out of IOMMU space" while trying to write data to a USB
> > or SATA HDD, before all the other error messages. After that the USB
> > and SATA drives were lost. All the other noise is attempts to
> > communicate with the dead devices.
> >
> >>Kernel command line:
> >>mce=bootlog root=/dev/ram0 real_root=/dev/evms/root init=/linuxrc
> > I boot from a 3ware RAID with EVMS.
> >>iommu=allowdac,merge,memaper=3
> > This is from Documentation/x86/x86_64/boot-options.txt.
> > With iommu=allowdac I was trying to avoid the DMA bug. Maybe it is not needed.
> >   allowdac           Allow double-address cycle (DAC) mode, i.e. DMA >4GB.
> >                      DAC is used with 32-bit PCI to push a 64-bit address in
> >                      two cycles. When off all DMA over >4GB is forced through
> >                      an IOMMU or software bounce buffering.
> >   merge              Do scatter-gather (SG) merging.
> >   memaper[=<order>]  Allocate an own aperture over RAM with size 32MB<<order.
> >                      (default: order=1, i.e. 64MB)
> > With the default aperture, 64 MB, DMA leaks very fast. I now have
> > memaper=5, 1 GB, because I must do my job and can't roll back to 2.6.28
> > due to a strange, mysterious problem with forcedeth NICs that I can't
> > explain or solve. If a solution for the DMA leak is not found, I'll try
> > to file a bug report about the problem with the NICs.
> >
> >>3w_9xxx.use_msi=1 snd-hda-intel.enable_msi=1 doevms quiet
> > I prefer to use MSI on that hardware.
> >>...
> >>Your BIOS doesn't leave a aperture memory hole
> >>Please enable the IOMMU option in the BIOS setup
> >>This costs you 256 MB of RAM
> >
> > The xw9400 BIOS does not have an IOMMU option in its setup. Now this
> > costs me 1 GB of RAM.
> >
> > Anyway, I can reliably reproduce the bug without all these bells and whistles.
> >
> > 2009/4/7 Grant Grundler <grundler@xxxxxxxxxx>:
> >> 2009/4/5 Данила Жукоцкий <optimusgd@xxxxxxxxx>:
> >>> 2009/4/4 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>:
> >>>>
> >>>> (switched to email. Please respond via emailed reply-to-all, not via the
> >>>> bugzilla web interface).
> >>>>
> >>>> On Fri, 3 Apr 2009 09:30:19 GMT bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
> >>>>
> >>>>> http://bugzilla.kernel.org/show_bug.cgi?id=13001
> >>>>>
> >>>>> Summary: PCI-DMA: Out of IOMMU space
> >>>>> Product: Platform Specific/Hardware
> >>>>> Version: 2.5
> >>>>> Kernel Version: 2.6.29-gentoo
> >>>>> Platform: All
> >>>>> OS/Version: Linux
> >>>>> Tree: Mainline
> >>>>> Status: NEW
> >>>>> Severity: normal
> >>>>> Priority: P1
> >>>>> Component: x86-64
> >>>>> AssignedTo: platform_x86_64@xxxxxxxxxxxxxxxxxxxx
> >>>>> ReportedBy: optimusgd@xxxxxxxxx
> >>>>> Regression: Yes
> >>>>>
> >>>>>
> >>>>> Created an attachment (id=20789)
> >>>>> --> (http://bugzilla.kernel.org/attachment.cgi?id=20789)
> >>>>> hwreport generated info
> >>>>>
> >>>>> After some IO activity the "PCI-DMA: Out of IOMMU space" message appears.
> >>>>> 2.6.28-gentoo-r4 works OK, so it is a regression.
> >>>>
> >>>> It is indeed a regression.
> >>>>
> >>>>> Dmesg fragments:
> >>>>>
> >>>>>
> >>>>> Apr 3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
> >>
> >> The bug report has a "dmesg" attachment but I wasn't able to find the
> >> "Out of IOMMU space" message in the dmesg. Can that be corrected?
> >> I was looking for IDE/SATA errors *before* the IOMMU errors.
> >>
> >> But I was surprised to find these bits:
> >> ...
> >> Kernel command line: mce=bootlog root=/dev/ram0
> >> real_root=/dev/evms/root init=/linuxrc iommu=allowdac,merge,memaper=3
> >> 3w_9xxx.use_msi=1 snd-hda-intel.enable_msi=1 doevms quiet
> >> Initializing CPU#0
> >> ...
> >> Your BIOS doesn't leave a aperture memory hole
> >> Please enable the IOMMU option in the BIOS setup
> >> This costs you 256 MB of RAM
> >> ...
> >>
> >> I'm not familiar with the iommu= parameter nor the warning about the BIOS.
> >> Any comments on that?
> >>
> >> thanks,
> >> grant
> >>
> >>>>> Apr 3 13:38:46 rngmhpamd sata_nv 0000:00:05.0: PCI-DMA: Out of IOMMU space for 4096 bytes
> >>>>> Apr 3 13:38:46 rngmhpamd ata1: EH in SWNCQ mode,QC:qc_active 0x1 sactive 0x1
> >>>>> Apr 3 13:38:46 rngmhpamd ata1: SWNCQ:qc_active 0x0 defer_bits 0x0 last_issue_tag 0xfafbfcfd
> >>>>> Apr 3 13:38:46 rngmhpamd dhfis 0x0 dmafis 0x0 sdbfis 0x0
> >>>>> Apr 3 13:38:46 rngmhpamd ata1: ATA_REG 0x50 ERR_REG 0x0
> >>>>> Apr 3 13:38:46 rngmhpamd ata1: tag : dhfis dmafis sdbfis sacitve
> >>>>> Apr 3 13:38:46 rngmhpamd ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6
> >>>>> Apr 3 13:38:46 rngmhpamd ata1.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in
> >>>>> Apr 3 13:38:46 rngmhpamd res 50/00:00:00:00:00/00:45:00:00:00/a0 Emask 0x40 (internal error)
> >>>>> Apr 3 13:38:46 rngmhpamd ata1.00: status: { DRDY }
> >>>>> Apr 3 13:38:46 rngmhpamd ata1: hard resetting link
> >>>>
> >>>> Are these scary-looking messages also present in 2.6.28?
> >>>>
> >>>> If so, perhaps the ata code is leaking DMA memory on the error-handling path?
> >>>>
> >>>>> Apr 3 13:38:47 rngmhpamd ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> >>>>> Apr 3 13:38:47 rngmhpamd ata1.00: configured for UDMA/100
> >>>>> Apr 3 13:38:47 rngmhpamd ata1: EH complete
> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware sectors: (250 GB/232 GiB)
> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] 488397168 512-byte hardware sectors: (250 GB/232 GiB)
> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write Protect is off
> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> >>>>> Apr 3 13:38:47 rngmhpamd sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> >>>>>
> >>>>> And
> >>>>>
> >>>>> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 4608 bytes
> >>>>> Mar 31 20:56:18 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
> >>>>> Mar 31 20:56:48 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 8
> >>>>> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 11776 bytes
> >>>>> Mar 31 20:56:48 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
> >>>>> Mar 31 20:57:19 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 8
> >>>>> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 11776 bytes
> >>>>> Mar 31 20:57:19 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
> >>>>> Mar 31 20:57:50 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 8
> >>>>> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 11776 bytes
> >>>>> Mar 31 20:57:50 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
> >>>>> Mar 31 20:58:21 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 8
> >>>>> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 11776 bytes
> >>>>> Mar 31 20:58:21 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
> >>>>> Mar 31 20:58:52 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 8
> >>>>> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 11776 bytes
> >>>>> Mar 31 20:58:52 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 69632 bytes
> >>>>> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Unhandled error code
> >>>>> Mar 31 20:59:01 rngmhpamd sd 8:0:0:0: [sdc] Result: hostbyte=0x07 driverbyte=0x00
> >>>>> Mar 31 20:59:01 rngmhpamd end_request: I/O error, dev sdc, sector 1137
> >>>>> Mar 31 20:59:01 rngmhpamd __ratelimit: 246 callbacks suppressed
> >>>>
> >>>> Do we have any debugging option for dumping the current PCI DMA
> >>>> allocations, to find out where it has all gone?
> >>>>
> >>>>
> >>>
> >>> Upgraded to 2.6.29-gentoo-r1 (2.6.29.1); the problem is still here, and I
> >>> can easily trigger it.
> >>> I booted with the default aperture, 64 MB, and while
> >>> writing to a USB HDD got this:
> >>>
> >>> Apr 5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 65536 bytes
> >>> Apr 5 14:28:56 rngmhpamd ehci_hcd 0000:00:02.1: PCI-DMA: Out of IOMMU space for 65536 bytes
> >>> Apr 5 14:29:27 rngmhpamd usb 1-4: reset high speed USB device using ehci_hcd and address 6
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe linux-ide" in
> >>> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>
> >>
> >
> > --
> > Best regards, Данила Жукоцкий, system administrator, ЗАО "Роснефтегазмаш"
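
For readers following the memaper and allowdac discussion in this thread, here is a minimal sketch of the arithmetic in the boot-options.txt text quoted above (aperture size is 32MB<<order, and DAC applies to DMA above 4GB). The helper names `aperture_bytes` and `needs_dac` are hypothetical and chosen only for illustration. This is not the kernel's implementation, just a check of the numbers mentioned in the messages.

```python
# Illustrative arithmetic only; these helpers are hypothetical and are
# not taken from the kernel source.

MIB = 1024 * 1024
BASE_APERTURE = 32 * MIB  # the "32MB" in "32MB<<order" from boot-options.txt


def aperture_bytes(order=1):
    """Aperture size implied by iommu=memaper=<order>, i.e. 32MB << order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return BASE_APERTURE << order


def needs_dac(bus_addr):
    """True if the address does not fit in 32 bits, i.e. DMA above 4GB."""
    return bus_addr > 0xFFFFFFFF


if __name__ == "__main__":
    # Orders mentioned in the thread: default (1), memaper=3, memaper=5.
    for order in (1, 3, 5):
        print("memaper=%d: %d MB" % (order, aperture_bytes(order) // MIB))
```

With order=3 this gives 256 MB, which matches the "This costs you 256 MB of RAM" line in the quoted dmesg, and order=5 gives 1024 MB, the 1 GB that Данила mentions.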