Re: Status update on sparc32 genirq support

Hi,

The good news:
sparc-next-2.6 with commit 4d14a459857bd151ecbd14bcd37b4628da00792b reverted
does NOT segfault. I did not apply the genirq patch yet.

The bad news:
Segfault gone, say hello to EXT2 read failure   :o(

I'll rebuild this kernel with the esp_debug.patch Sam sent a couple of days ago.


[    0.233333] esp: esp0, regs[fd00a000:fd009000] irq[36]
[    0.236666] esp: esp0 is a FAS100A, 40 MHz (ccf=0), SCSI ID 7
[    3.243333] scsi0 : esp
[    3.483332] scsi 0:0:1:0: Direct-Access     FUJITSU  MAP3735N SUN72G  0401 PQ: 0 ANSI: 4
[    3.486666] scsi target0:0:1: Beginning Domain Validation
[    3.493332] scsi target0:0:1: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 15)
[    3.499999] scsi target0:0:1: Domain Validation skipping write tests
[    3.503332] scsi target0:0:1: Ending Domain Validation
[    3.743332] scsi 0:0:3:0: Direct-Access     FUJITSU  MAP3735N SUN72G  0401 PQ: 0 ANSI: 4
[    3.746666] scsi target0:0:3: Beginning Domain Validation
[    3.753332] scsi target0:0:3: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 15)
[    3.756666] scsi target0:0:3: Domain Validation skipping write tests
[    3.759999] scsi target0:0:3: Ending Domain Validation
[    4.469999] esp: esp1, regs[fd00c000:fd00b000] irq[53]
[    4.473332] esp: esp1 is a FASHME, 40 MHz (ccf=0), SCSI ID 7
[    7.479999] scsi1 : esp
...
[   11.029998] sd 0:0:1:0: [sda] 143374738 512-byte logical blocks: (73.4 GB/68.3 GiB)
[   11.033332] sd 0:0:3:0: [sdb] 143374738 512-byte logical blocks: (73.4 GB/68.3 GiB)
[   11.036665] sd 0:0:1:0: [sda] Write Protect is off
[   11.043332] sd 0:0:1:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   11.046665] sd 0:0:3:0: [sdb] Write Protect is off
[   11.053332] sd 0:0:3:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   11.066665]  sda: sda1 sda2 sda3
[   11.073332]  sdb: sdb1 sdb2 sdb3 sdb4 sdb5 sdb6 sdb7
[   11.089998] sd 0:0:1:0: [sda] Attached SCSI disk
[   11.093332] sd 0:0:3:0: [sdb] Attached SCSI disk
[   11.106665] EXT3-fs: barriers not enabled
[   11.113332] kjournald starting.  Commit interval 5 seconds
[   11.116665] EXT3-fs (sdb4): mounted filesystem with ordered data mode
[   11.119998] VFS: Mounted root (ext3 filesystem) readonly on device 8:20.
[   11.123332] Freeing unused kernel memory: 108k freed
INIT: version 2.86 booting
[   12.673332] NET: Registered protocol family 1

Gentoo Linux; http://www.gentoo.org/
 Copyright 1999-2007 Gentoo Foundation; Distributed under the GPLv2

 * Mounting proc at /proc ...                                             [ ok ]
 * Mounting sysfs at /sys ...                                             [ ok ]
 * Mounting /dev for udev ...                                             [ ok ]
...
blahblah
...
 * Checking root filesystem ...fsck.ext3: No such file or directory while trying to open /dev/sdb4
/dev/sdb4:
The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

 * Filesystem couldn't be fixed :(
         [ !! ]
Give root password for maintenance
(or type Control-D to continue):
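
A note on reading the log above: the kernel reports the root filesystem on device 8:20, while fsck fails on /dev/sdb4. Under the conventional Linux sd numbering (block major 8, 16 minors per disk), 8:20 is sdb4, so both messages refer to the same partition. The sketch below only illustrates that mapping; the helper name sd_name is made up for this example and is not part of any kernel or userspace tool.

```python
def sd_name(major, minor):
    """Map a (major, minor) pair to a conventional sd device name.

    Assumes the first SCSI disk major (8), which covers 16 disks
    with 16 minors each: minor 0 is the whole disk, 1-15 are partitions.
    """
    if major != 8:
        raise ValueError("only block major 8 is handled in this sketch")
    if not 0 <= minor <= 255:
        raise ValueError("minor out of range for major 8")
    disk, part = divmod(minor, 16)
    name = "sd" + chr(ord("a") + disk)
    return name if part == 0 else f"{name}{part}"


# Device number taken from the "VFS: Mounted root" line in the log.
print(sd_name(8, 20))
```

Since fsck.ext3 reports "No such file or directory" for a partition the kernel has already mounted, the failure is in opening the /dev/sdb4 path at that point in the boot, not in selecting the wrong disk.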


Marcel


On Tue, Mar 8, 2011 at 12:17 PM, Marcel van Nies <morcles@xxxxxxxxx> wrote:
> Hi,
>
> 2.6.33.7 with commit 4d14a459857bd151ecbd14bcd37b4628da00792b reverted
> does not segfault.
> I also tried sparc-next-2.6, but I messed up my tree somehow. I will
> try again later.
>
> M
>
> On Tue, Mar 8, 2011 at 8:45 AM, Marcel van Nies <morcles@xxxxxxxxx> wrote:
>> Hi,
>>
>>> But first step is to get confirmation that reverting this commit
>>> indeed fixes the bug
>>
>> I'll try that.
>> M
>>
>> On Tue, Mar 8, 2011 at 8:37 AM, Marcel van Nies <morcles@xxxxxxxxx> wrote:
>>> Hi,
>>>
>>> It appears that two consecutive commits are causing problems on
>>> hyperSPARC; I noticed that too late.
>>>
>>> Commit 4d14a459857bd151ecbd14bcd37b4628da00792b (the one I reported
>>> earlier) only causes the system to hang, not panic:
>>> [   11.266665] sd 0:0:1:0: [sda] Attached SCSI disk
>>> [   11.279998] sd 0:0:3:0: [sdb] Attached SCSI disk
>>> [   11.299998] kjournald starting.  Commit interval 5 seconds
>>> [   11.303332] EXT3-fs: mounted filesystem with writeback data mode.
>>> [   11.306665] VFS: Mounted root (ext3 filesystem) readonly on device 8:20.
>>> [   11.309998] Freeing unused kernel memory: 100k freed
>>> <system hangs here - stop-A does go back to prom>
>>>
>>> and
>>> commit c658ad1b4e1520511da8323aa5e60d444cc303ed
>>> Author: David S. Miller <davem@xxxxxxxxxxxxx>
>>> Date:   Fri Dec 11 00:44:47 2009 -0800
>>>
>>>    sparc64: Add syscall tracepoint support.
>>>
>>>    Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
>>>
>>> actually makes the kernel panic:
>>> [   11.336665] Freeing unused kernel memory: 100k freed
>>> [   11.419998] Kernel panic - not syncing: Attempted to kill init!
>>> [   11.423332] [f002f5b8 : do_group_exit+0x84/0xb4 ]
>>>  [f0039490 : get_signal_to_deliver+0x338/0x35c ]
>>>  [f00124cc : do_signal+0x30/0x8f0 ]
>>>  [f0012da0 : do_notify_resume+0x14/0x38 ]
>>>  [f000fca4 : signal_p+0x14/0x24 ]
>>>  [f000edfc : srmmu_fault+0x58/0x68 ]
>>> [   11.466665] Press Stop-A (L1-A) to return to the boot prom
>>>
>>>
>>> Marcel
>>>
>>>
>>> On Tue, Mar 8, 2011 at 8:08 AM, Sam Ravnborg <sam@xxxxxxxxxxxx> wrote:
>>>> On Mon, Mar 07, 2011 at 11:01:20PM -0800, David Miller wrote:
>>>>> From: Sam Ravnborg <sam@xxxxxxxxxxxx>
>>>>> Date: Tue, 8 Mar 2011 07:00:39 +0100
>>>>>
>>>>> > Added davem...
>>>>> > We see strange SEGV faults in userspace and fail to read from ext2..
>>>>> > All on some (but not all) sparc32 boxes.
>>>>>
>>>>> I saw the original report.
>>>>>
>>>>> But reverting this commit is the wrong thing to do from what I can
>>>>> tell.
>>>>>
>>>>> Either we have:
>>>>>
>>>>> 1) A compiler code gen bug.
>>>>>
>>>>> 2) Some piece of code which is sparc32 specific is invoking memset
>>>>>    or memcpy in a way which makes assumptions which are in fact not
>>>>>    valid
>>>>>
>>>>> 3) The code change is merely making cache offsets change, masking the
>>>>>    true problem
>>>>>
>>>>> Especially in cases #2 and #3 we're just hiding a heisen-bug and
>>>>> not fixing the real problem.
>>>> Agree on this.
>>>> But first step is to get confirmation that reverting this commit
>>>> indeed fixes the bug. Then we can go hunting for 2), 3) or 1).
>>>> I hope we will find that 2) is the culprit.
>>>>
>>>>        Sam
>>>>
>>>
>>
>