Bus/cache error on partition check

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I am trying to debug a bus/cache error hit by a 2.6.21 kernel on am
RMI XLR cpu. It only happens occasionally and could be temperature
related. I am trying to get some understanding of what the kernel is
trying to do at the point of the exception and I hope someone can shed
some light on it.

The problem appears when trying to access a SATA drive connected to
the CPU through a point to point PCI bus. The PCIX interface maps two
areas of memory, an IO area, and a memory space. Accessing the IO area
works OK always, but on the first access to the memory mapped area I
sometimes see this exception, after a pause of around 4 seconds. I
tried different drives, cables etc. to no avail.

I enclose the whole trace, messed up a bit by my own printks,
apologies for that. The cache trace epc is phoenix_wait, the xlr
cpu_wait, and the errorepc is  ioread8, to the physical address
0xf0000088 which is part of the PCIX memory space mapping.

This seems to indicate an exception on a PCI access, but I am missing
the path between the read_dev_sector() and the exception, I would have
expected to see a call to phoenix_pcibios_read() before the crash.

I am basically trying to understand what is the kernel waiting for
after the call to io_schedule, why the 4 seconds pause, and how can
that cause a cache exception error.

It's the first time I look at this part of the kernel,and I'd really
appreciate an expert opinion on what might be going on.

Regards,
Alex

-------------------------------------------------------------------------------------

0:<5>0:Linux version 2.6.21.7-37:155M (alex@alex-laptop) (gcc version
4.1.2) #17 SMP PREEMPT Fri Aug 22 10:31:31 BST 2008
0:Using default Mac address
0:Passed argv is 8c205c5c envp is 8c206164
0:Enabling XLR Linux Loader support
0:Physical Address Map of XLR:
0:      0000000000 --> 00000fffff ( Memory )
0:      0000100000 --> 000c0fffff ( Memory )
0:      000c100000 --> 000fffffff ( Memory )
0:      0010000000 --> 0013ffffff ( PCIX IO Space )
0:      0014000000 --> 0017ffffff (  *** HOLE *** )
0:      0018000000 --> 0019ffffff ( PCIX CFG Space )
0:      001a000000 --> 001bffffff (  *** HOLE *** )
0:      001c000000 --> 001dffffff ( Flash Controller Space )
0:      001e000000 --> 001eefffff (  *** HOLE *** )
0:      001ef00000 --> 001effffff ( Phoenix IO Space )
0:      001f000000 --> 001fffffff (  *** HOLE *** )
0:      0020000000 --> 003fffffff ( Memory )
0:      0040000000 --> 007fffffff ( Memory )
0:      0080000000 --> 00bfffffff ( Memory )
0:      00c0000000 --> 00dfffffff ( Memory )
0:      00e0000000 --> 00efffffff ( Memory )
0:      00f0000000 --> 00ffffffff ( PCIX Memory Space )
0:      0100000000 --> 01ffffffff ( Memory )
0:      0200000000 --> fffffffffe (  *** HOLE *** )
0:Using 0x0000000f as linux cpu mask
0:Using 0xfffffff0 as loader cpu mask
0:No KSEG args passed. Using defaults
0:Using 0x20800000 as KUSEG start and 0x5f7fffff as KUSEG size
0:phnx_loader: RMIOS-APP shared memory size=0x00200000
0:Adding 0x8400000 @ 0x3400000
0:Adding 0x800000 @ 0x20000000
0:Adding 0x1 @ 0x7fffffff
0:Adding 0x40000000 @ 0xb0000000
0:Adding 0x100000000 @ 0x100000000
0:Using bootloader passed physical memory map
0:
Callin prom_init_xlr_loader
0:argc=6, argv=8c205c5c, envp=8c206164, prom_info=8384def8
0:argv[1] = [xlr_loader]
0:arcs_cmdline=[xlr_loader ]
0:argv[2] = [root=/dev/sda1]
0:arcs_cmdline=[xlr_loader root=/dev/sda1 ]
0:argv[3] = [kuseg_start_lo=20800000]
0:arcs_cmdline=[xlr_loader root=/dev/sda1 kuseg_start_lo=20800000 ]
0:argv[4] = [kuseg_size_lo=5F7FFFFF]
0:arcs_cmdline=[xlr_loader root=/dev/sda1 kuseg_start_lo=20800000
kuseg_size_lo=5F7FFFFF ]
0:argv[5] = [linux_cpu_mask=0000000f]
0:arcs_cmdline=[xlr_loader root=/dev/sda1 kuseg_start_lo=20800000
kuseg_size_lo=5F7FFFFF linux_cpu_mask=0000000f ]
0:prom_init: envp[0] = [BOARD=ARIZONA]
0:prom_init: envp[1] = [ttyS0]
0:arcs_cmdline=[xlr_loader root=/dev/sda1 kuseg_start_lo=20800000
kuseg_size_lo=5F7FFFFF linux_cpu_mask=0000000f  console=ttyS0,38400
rdinit=/sbin/init  ]
0:MAC ADDR BASE: 00:1c:69:20:00:26
0:Master CPU Thread: 0 of 0 running on Phoenix 0
0:Initializing PIC...
0:Initializing message ring for cpu_0
0:&phnx_counters = 0x837ff000, sizeof(phnx_counters) = 0x1400
0:on_chip init done
0:CPU revision is: 000c0809
0:Determined physical RAM map:
0: memory: 08400000 @ 03400000 0:(usable)
0: memory: 00800000 @ 20000000 0:(usable)
0: memory: 00000001 @ 7fffffff 0:(usable)
0: memory: 40000000 @ b0000000 0:(usable)
0: memory: 100000000 @ 100000000 0:(usable)
0:<6>Wasting 425984 bytes for tracking 13312 unused pages
0:<7>On node 0 totalpages: 2097152
0:<7>  DMA zone: 1024 pages used for memmap
0:<7>  DMA zone: 0 pages reserved
0:<7>  DMA zone: 130048 pages, LIFO batch:31
0:<7>  Normal zone: 0 pages used for memmap
0:<7>  HighMem zone: 15360 pages used for memmap
0:<7>  HighMem zone: 1950720 pages, LIFO batch:31
0:(PROM) CPU present map: f
0:Phys CPU present map: f, possible map f
0:Detected 3 Slave CPU(s)
0:Built 1 zonelists.  Total pages: 2080768
0:<5>Kernel command line: xlr_loader root=/dev/sda1
kuseg_start_lo=20800000 kuseg_size_lo=5F7FFFFF linux_cpu_mask=0000000f
 console=ttyS0,38400  rdinit=/sbi
0:Primary instruction cache 32kB, 8-way, linesize 32 bytes.
0:Primary data cache 32kB 8-way, linesize 32 bytes.
0:PID hash table entries: 2048 (order: 11, 8192 bytes)
0:mips_hpt_frequency = 66666666
0:Using 66.667 MHz high precision timer. cycles_per_jiffy=1200000
0:phoenix_timer_setup: phoenix_timer_stats = 8382c100,
phoenix_timer_diff = 8382c180, phoenix_timer_count = 8382c200,
phoenix_timer_epc = 8382c280, phoenix_0
0:Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
0:Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
0:<6>Memory: 5314664k/135168k available (3431k kernel code, 70796k
reserved, 679k data, 156k init, 5251072k highmem)
0:<7>Calibrating delay loop... 0:266.24 BogoMIPS (lpj=133120)
0:<6>Security Framework v1.0.0 initialized
0:<6>Capability LSM initialized
0:Mount-cache hash table entries: 512
0:(PROM): waking up phys cpu# 1, bucket_1, gp = 87972000
0:(PROM): sent a wakeup message to bucket 1
1:Initializing CPU:1 (1), current=87960440, gp = 87972000
1:CPU revision is: 000c0809
1:<7>Calibrating delay loop... 1:277.50 BogoMIPS (lpj=138752)
1:<7>Calibrating delay loop... 1:276.48 BogoMIPS (lpj=138240)
0:(PROM): waking up phys cpu# 2, bucket_2, gp = 87980000
0:(PROM): sent a wakeup message to bucket 2
2:Initializing CPU:2 (2), current=8797b480, gp = 87980000
2:CPU revision is: 000c0809
2:<7>Calibrating delay loop... 2:393.21 BogoMIPS (lpj=196608)
2:<7>Calibrating delay loop... 2:394.24 BogoMIPS (lpj=197120)
0:(PROM): waking up phys cpu# 3, bucket_3, gp = 8798a000
0:(PROM): sent a wakeup message to bucket 3
3:Initializing CPU:3 (3), current=8797a868, gp = 8798a000
3:CPU revision is: 000c0809
3:<7>Calibrating delay loop... 3:595.96 BogoMIPS (lpj=297984)
3:<7>Calibrating delay loop... 3:595.96 BogoMIPS (lpj=297984)
0:<6>Brought up 4 CPUs
1:migration_cost=1:10001:
0:<6>NET: Registered protocol family 16
0:Registering XLR/XLS PCIX/PCIE Controller.
0:<4>registering PCI controller with io_map_base unset
0:Packet Vision's CMD driver version 1.1
0:<5>SCSI subsystem initialized
0:<7>libata version 2.20 loaded.
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_write
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_write
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_write
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:<6>NET: Registered protocol family 2
0:<6>Time: MIPS clocksource has been installed.
0:IP route cache hash table entries: 16384 (order: 4, 65536 bytes)
0:TCP established hash table entries: 65536 (order: 8, 1048576 bytes)
0:TCP bind hash table entries: 65536 (order: 7, 786432 bytes)
0:<6>TCP: Hash tables configured (established 65536 bind 65536)
0:<6>TCP reno registered
0:highmem bounce pool size: 64 pages
0:<6>Installing knfsd (copyright (C) 1996 okir@xxxxxxxxxxxx).
0:<6>io scheduler noop registered
0:<6>io scheduler anticipatory registered (default)
0:<6>io scheduler deadline registered
0:<6>io scheduler cfq registered
0:Registered phoenix msgring driver: major=0
0:Registered phoenix app loader driver: phnx_loader_major=245
0:<6>XLR RNG detected
0:Registered XLR tracebuffer driver with Major No. [0]
0:rmisec_init: BEGIN INIT
0: PHXSEC: rmisec_init_device
 0:Registering PHOENIX_SEC
0:PHXSEC:  Init  mmio=bef0b000   *mmio=0x0
0:phxsec_init_device: Phoenix SecEng at 0xbef0b000 (descriptors=20)
0:RMISEC:  register_chrdev() = 0
0:<6>Serial: 8250/16550 driver $Revision: 1.1.1.7.2.10 $ 2 ports, IRQ
sharing disabled
0:<6>serial8250: ttyS0 at MMIO 0x0 (irq = 17) is a 16550A
0:<6>serial8250: ttyS1 at MMIO 0x0 (irq = 18) is a 16550A
0:<6>nbd: registered device at major 43
0:sbull_open() called
0:rescan_partitions
0:check_partition1
0:check_partition2
0:<6> sbulla:0:check_partition2
0:Calling 834de1e0 with state=8b625000 and bdev=87946b80
0:msdos_partition
0:msdos_partition: Reading from bdev 87946b80
0:read_dev_sector: address_space=87946cac
0:read_cache_page: mapping=87946cac index=0 filler=834c4b7c data=00000000
0:read_cache_page1
0:read_dev_sector: page=83a15540
0:check_partition: res=0
0:check_partition3
0: unknown partition table
0:Initializing spi4 ethernet driver
0:This board does not support spi4
0:[setup_mac_spill_sizes]: max_num_desc = 512
max_frin_spill = 2048
max_frout_spill = 2048
max_class_0_spill = 512
max_class_1_spill = 512
max_class_2_spill = 512
max_class_3_spill = 512
0:Initializing gmac 0 in hybrid mode:
0:gmac_0, aux control/status reg = ffff
0:phy reported unknown mac speed, defaulting to 1000Mbps
0:configuring gmac_0 in 1000Mbps mode
0:GMAC_0 initialized as eth0
0:Initializing gmac 1 in hybrid mode:
0:gmac_1, aux control/status reg = ffff
0:phy reported unknown mac speed, defaulting to 1000Mbps
0:configuring gmac_1 in 1000Mbps mode
0:GMAC_1 initialized as eth1
0:Initializing gmac 2 in hybrid mode:
0:gmac_2, aux control/status reg = 2005
0:configuring gmac_2 in 10Mbps mode
0:GMAC_2 initialized as eth2
0:Initializing gmac 3 in hybrid mode:
0:gmac_3, aux control/status reg = 5
0:configuring gmac_3 in 10Mbps mode
0:GMAC_3 initialized as eth3
0:[user_mac_init]: user_mac_data=0x534d28, psb_shm_size=0x800000
0:Registered phoenix user_mac driver: major=0
0:<6>Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
0:<6>ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
0:Initializing Phoenix PCMCIA IDE...
0:Phoenix PCMCIA configured as IDE interface = 0
0:<7>Probing IDE interface ide0...
0:hda: TOSHIBA THNCF1G02QG, 0:CFA DISK drive
0:ide0 at 0xbd0001f1-0xbd0001f8,0xbd0003f7 on irq 210:
0:<6>hda: max request size: 128KiB
0:<6>hda: 2000880 sectors (1024 MB)0: w/2KiB Cache0:, CHS=1985/16/630:
0:rescan_partitions
0:check_partition1
0:check_partition2
0:<6> hda:0:check_partition2
0:Calling 834de1e0 with state=8b625000 and bdev=87946b80
0:msdos_partition
0:msdos_partition: Reading from bdev 87946b80
0:read_dev_sector: address_space=87946cac
0:read_cache_page: mapping=87946cac index=0 filler=834c4b7c data=00000000
0:read_cache_page1
0:read_cache_page2
0:lock_page
0:lock_page1
0:sync_page: word=83a15560
0:About to call 834b9f88
0:Before io_schedule()
0:sync_page: END
0:read_cache_page2.1
0:read_cache_page3
0:read_dev_sector: page=83a15560
0:msdos_partition1
0:msdos_partition2
0:msdos_partition3
0: hda10:msdos_partition4
0:
0:msdos_partition: END
0:check_partition: res=1
0:check_partition3
0:rescan_partitions1
0:<7>sata_sil 0000:00:01.0: version 2.1
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:phoenix_pcibios_write
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:<6>sata_sil 0000:00:01.0: Applying R_ERR on DMA activate FIS errata fix
0:phoenix_pcibios_read
0:phoenix_pcibios_read
0:<6>ata1: SATA max UDMA/100 cmd 0xc0000080 ctl 0xc000008a bmdma
0xc0000000 irq 24
0:<6>ata2: SATA max UDMA/100 cmd 0xc00000c0 ctl 0xc00000ca bmdma
0xc0000008 irq 24
0:<6>scsi0 : sata_sil
3:<6>ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
3:<6>ata1.00: ATA-7: ST3160815AS, 4.AAB, max UDMA/133
3:<6>ata1.00: 312581808 sectors, multi 0: LBA48 NCQ (depth 0/32)
3:<6>ata1.00: configured for UDMA/100
0:<6>scsi1 : sata_sil
3:<6>ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
3:<6>ata2.00: ATA-6: Super Talent Tech, NCC-0830, max UDMA/133
3:<6>ata2.00: 31977472 sectors, multi 1: LBA
3:<6>ata2.00: applying bridge limits
3:<6>ata2.00: configured for UDMA/100
0:<5>scsi 0:0:0:0: Direct-Access     ATA      ST3160815AS      4.AA
PQ: 0 ANSI: 5
0:sd_probe
0:READ_CAPACITY
0:<5>SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
0:<5>sda: Write Protect is off
0:<7>sda: Mode Sense: 00 3a 00 00
0:<5>SCSI device sda: write cache: enabled, read cache: enabled,
doesn't support DPO or FUA
0:rescan_partitions
0:READ_CAPACITY
0:<5>SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
0:<5>sda: Write Protect is off
0:<7>sda: Mode Sense: 00 3a 00 00
0:<5>SCSI device sda: write cache: enabled, read cache: enabled,
doesn't support DPO or FUA
0:check_partition1
0:check_partition2
0:<6> sda:0:check_partition2
0:Calling 834de1e0 with state=8b625000 and bdev=87946b80
0:msdos_partition
0:msdos_partition: Reading from bdev 87946b80
0:read_dev_sector: address_space=87946cac
0:read_cache_page: mapping=87946cac index=0 filler=834c4b7c data=00000000
0:read_cache_page1
0:read_cache_page2
0:lock_page
0:lock_page1
0:sync_page: word=83a15580
0:About to call 834b9f88
0:Before io_schedule()


( ~4secs pause )


*********************************************
cpu_0 received a bus/cache error
*********************************************
Bridge: Phys Addr = 0x0000000000, Device_AERR = 0x00000000
Bridge: The devices reporting AERR are:
CPU: (XLR specific) Cache Error log = 0x0000007800004601, Phy Addr =
0x00f0000088
CPU: epc = 0x8340e324, errorepc = 0x835bed34, cacheerr = 0x00000000
Can not handle bus/cache error - Halting cpu


[Index of Archives]     [Linux MIPS Home]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Linux]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux