Re: Since Linux 4.1: A lot of AMD-Vi IO_PAGE_FAULTs

[+cc Tejun, linux-ide]

On Thu, Jul 23, 2015 at 11:22 PM, Andreas Hartmann
<andihartmann@xxxxxxxxxx> wrote:
> On Tue, Jul 21, 2015 at 06:35PM +0200, Joerg Roedel wrote:
>> On Tue, Jul 21, 2015 at 06:20:23PM +0200, Andreas Hartmann wrote:
>>> [   48.193901] <6>[fglrx] Firegl kernel thread PID: 1840
>>> [   48.193985] <6>[fglrx] Firegl kernel thread PID: 1841
>>> [   48.194063] <6>[fglrx] Firegl kernel thread PID: 1842
>>> [   48.194172] <6>[fglrx] IRQ 28 Enabled
>>> [   48.261580] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000
>>> [   48.261586] <6>[fglrx] Reserved FB block: Unshared offset:f7b4000, size:4000
>>> [   48.261587] <6>[fglrx] Reserved FB block: Unshared offset:f7b8000, size:548000
>>> [   48.261588] <6>[fglrx] Reserved FB block: Unshared offset:3fff3000, size:d000
>>
>> From a first glance it doesn't look like an IOMMU driver issue, because
>> the addresses where the faults happen are not from the AMD IOMMU driver.
>>
>> And you have proprietary closed-source drivers loaded, can you reproduce
>> the issue without fglrx?
>
> Yes. I attached this one.
>
> Meanwhile I tested with 4.0.9, too. I wasn't able to reproduce the
> problem with this kernel even after lots of reboots (the problem w/ 4.1
> usually comes up during boot process (but not only - it can be seen
> after boot process, too)).
>
> The problem always is, that there are errors w/ one of the sata discs
> and at the same time, IO_PAGE_FAULT errors are rising as described before:
>
> [  152.533708] ata3.00: failed command: READ FPDMA QUEUED
> [  152.538102] ata3.00: failed command: READ FPDMA QUEUED
> [  152.539862] ata3.00: failed command: READ FPDMA QUEUED
> [  152.541778] ata3.00: failed command: WRITE FPDMA QUEUED
> [  152.543861] ata3.00: failed command: WRITE FPDMA QUEUED
>
> [ 5818.068050] ata2.00: failed command: WRITE FPDMA QUEUED
> [ 5818.068059] ata2.00: failed command: WRITE FPDMA QUEUED
>
> I compared dmesg from 4.1 w/ 4.0 and I realized the following *missing*
> entries in 4.1:
>
> [    0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] enabled)
> [    0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] enabled)
>
>
> What does this mean? Is there missing some part of the acpi initialization?
>
>
> Thanks for any hint as Linux 4.1 is completely unusable here with these
> errors.

This looks more like an AHCI problem than an IOMMU or PCI problem.
It seems like the device has the wrong idea about where its DMA buffers
are: the faulting addresses in the events below are either implausibly
large or very close to zero.  Maybe something scribbled on its command
list?

From your attachments:

# lspci -vvs 00:11.0
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40) (prog-if 01 [AHCI 1.0])

pci 0000:00:11.0: [1002:4391] type 00 class 0x010601
ahci 0000:00:11.0: version 3.0
ahci 0000:00:11.0: AHCI 0001.0200 32 slots 6 ports 6 Gbps 0x3f impl SATA mode
ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008 address=0x40eba32100618000 flags=0x0010]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008 address=0x40eba32100618040 flags=0x0010]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008 address=0x0000000000000000 flags=0x0000]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008 address=0x00000000000000c0 flags=0x0000]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008 address=0x0000000000000040 flags=0x0000]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008 address=0x00000000000001c0 flags=0x0000]
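
If it helps to sort through a longer dmesg, here is a rough Python sketch
that pulls the IO_PAGE_FAULT events quoted above out of a log and sorts the
faulting addresses into buckets.  The names (parse_faults, classify) and the
two thresholds are made up for illustration; in particular, the 4 KiB and
48-bit limits are only assumptions about what a plausible DMA address looks
like on this box, not anything taken from the AMD IOMMU driver.

```python
import re

# Matches one AMD-Vi IO_PAGE_FAULT event after wrapped lines are rejoined.
EVENT_RE = re.compile(
    r"IO_PAGE_FAULT\s+device=(?P<device>[0-9a-f:.]+)\s+"
    r"domain=(?P<domain>0x[0-9a-f]+)\s+"
    r"address=(?P<address>0x[0-9a-f]+)\s+"
    r"flags=(?P<flags>0x[0-9a-f]+)",
    re.IGNORECASE,
)

# Illustrative thresholds (assumptions, not driver constants).
LOW_LIMIT = 0x1000     # anything inside the first 4 KiB page
HIGH_LIMIT = 1 << 48   # anything that does not fit in 48 bits


def parse_faults(log_text):
    """Return one dict per IO_PAGE_FAULT event found in log_text."""
    # Mail clients wrap each event over two lines, so collapse all
    # whitespace before matching.
    flat = " ".join(log_text.split())
    faults = []
    for match in EVENT_RE.finditer(flat):
        faults.append({
            "device": match.group("device"),
            "domain": int(match.group("domain"), 16),
            "address": int(match.group("address"), 16),
            "flags": int(match.group("flags"), 16),
        })
    return faults


def classify(address):
    """Bucket a faulting address using the assumed thresholds above."""
    if address < LOW_LIMIT:
        return "near-zero"
    if address >= HIGH_LIMIT:
        return "beyond-48-bit"
    return "plausible"


# Two of the events from this thread, wrapped the way they arrived by mail.
SAMPLE = """
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008
address=0x40eba32100618000 flags=0x0010]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008
address=0x00000000000000c0 flags=0x0000]
"""

if __name__ == "__main__":
    for fault in parse_faults(SAMPLE):
        print(fault["device"], hex(fault["address"]),
              classify(fault["address"]))
```

An address landing in either of the odd buckets does not prove anything by
itself, but a device that only ever faults on such addresses is a hint that
its descriptors hold garbage rather than that the IOMMU mappings are wrong.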