On Wed, 2006-10-18 at 15:32 -0700, Sean Bruno wrote:
> On Wed, 2006-10-18 at 15:24 -0700, Sean Bruno wrote:
> > I have had a tough time tracking this one down, however I can say for
> > certain that the 29320 is really having trouble if a LUN is power
> > cycled.
> >
> > I don't have access to a BUS analyzer right now, but here is my
> > regression.
> >
> > 1. Hook an external SCSI array/disk to a 29320.
> > 2. Power up SCSI array/disk
> > 3. Power up PC with 29320.
> > 4. When PC has booted, login and test device by creating a file
> > system, eg. mkfs /dev/sda (or whatever disk the array is called on
> > ur machine).
> > 5. Power cycle array/disk
> > 6. Retest device with another 'mkfs /dev/sda' ... panic/crash/lock-up
> > ensues.
> >
> >
> >
> > This did not happen in 2.6.15.7 but did appear in 2.6.16 and higher.
> >
> From 2.6.19-rc2 I at least get something from a crash without the entire
> box locking up on me.
>
> The process tdg_2 is a 'test data generator' basically it writes data to
> the scsi disk in a testable pattern that is later validated.
>
> ------------[ cut here ]------------
> kernel BUG at mm/slab.c:594!
> invalid opcode: 0000 [#1]
> SMP
> Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc iscsi_tcp
> libiscsi scsi_transport_iscsi ipv6 video sbs i2c_ec i2c_core button
> battery asus_acpi ac parport_pc lp parport snd_intel8x0 snd_ac97_codec
> snd_ac97_bus sg snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
> snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm floppy snd_timer snd
> soundcore snd_page_alloc serio_raw ide_cd skge cdrom pcspkr dm_snapshot
> dm_zero dm_mirror dm_mod aic79xx scsi_transport_spi sd_mod scsi_mod ext3
> jbd ehci_hcd ohci_hcd uhci_hcd
> CPU: 0
> EIP: 0060:[<c0169562>] Not tainted VLI
> EFLAGS: 00010246 (2.6.19-rc2 #1)
> EIP is at kmem_cache_free+0x29/0x6d
> eax: 00000000 ebx: dffae300 ecx: dff91b80 edx: c1a00000
> esi: dffaaf80 edi: 00000000 ebp: d3f324c0 esp: d3fb9dd0
> ds: 007b es: 007b ss: 0068
> Process tdg_2 (pid: 2362, ti=d3fb9000 task=dfd6cd50 task.ti=d3fb9000)
> Stack: dffae300 dffaaf80 00000000 c0154448 00000000 d3e09a80 dffaaf80
> d3e09a80
> c018bafc 00001000 00000000 c018b822 e088efa0 00001000 00000000
> 0000000a
> d3fb9ef0 d43f76c8 00003000 00000000 00000001 c130cac8 00008000
> 00000000
> Call Trace:
> [<c0154448>] mempool_free+0x66/0x6b
> [<c018bafc>] bio_free+0x25/0x30
> [<c018b822>] bio_put+0x28/0x29
> [<e088efa0>] scsi_execute_async+0x15f/0x33d [scsi_mod]
> [<e09c9913>] sg_common_write+0x704/0x772 [sg]
> [<e09c9ba6>] sg_new_write+0x225/0x248 [sg]
> [<e09cae45>] sg_write+0x106/0x33a [sg]
> [<c016dae7>] vfs_write+0xa8/0x159
> [<c016e114>] sys_write+0x41/0x67
> [<c0103dc9>] sysenter_past_esp+0x56/0x79
> DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x79
>
> Leftover inexact backtrace:
>
> [<c031007b>] sleep_on+0x1e/0x6c
> =======================
> Code: 5f c3 89 c1 8d 82 00 00 00 40 c1 e8 0c 57 89 d7 6b d0 28 03 15 00
> d6 50 c0 56 53 8b 02 f6 c4 40 74 03 8b 52 0c 8b 02 84 c0 78 08 <0f> 0b
> 52 02 e6 6b 33 c0 39 4a 20 74 08 0f 0b ca 0d e6 6b 33 c0
> EIP: [<c0169562>] kmem_cache_free+0x29/0x6d SS:ESP 0068:d3fb9dd0
Does this only occur with sg or is that the only way you got a trace? In
the original bug report you mentioned it occurring with mkfs, but the
bug oops is from a sg request. Is tdg_2 run while the mkfs is running?
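
One way to check whether the crash also shows up without sg might be a small write-then-verify test through the ordinary file/block interface. The sketch below is only an illustration of the "testable pattern that is later validated" idea. It is not tdg_2, whose source is not shown in this thread. The function names, the 4096-byte block size, and the pattern layout are all assumptions. Pointing it at a real device such as /dev/sda destroys the data on it, and a read-back shortly after the write may be served from the page cache rather than the disk.

```python
import os
import struct

BLOCK_SIZE = 4096  # assumed block size; must be a multiple of 8


def make_block(index, block_size=BLOCK_SIZE):
    """Build one block whose contents encode its own block index."""
    # The 8-byte big-endian index is repeated to fill the whole block.
    return struct.pack(">Q", index) * (block_size // 8)


def write_pattern(path, blocks, block_size=BLOCK_SIZE):
    """Write `blocks` self-describing blocks to `path`, starting at offset 0."""
    # O_CREAT lets the sketch run against a scratch file as well as a device.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    with os.fdopen(fd, "wb") as f:
        for index in range(blocks):
            f.write(make_block(index, block_size))
        f.flush()
        os.fsync(f.fileno())  # push the data out before returning


def verify_pattern(path, blocks, block_size=BLOCK_SIZE):
    """Read the blocks back and return the indices that do not match."""
    bad = []
    with open(path, "rb") as f:
        for index in range(blocks):
            data = f.read(block_size)
            # A short read also counts as a mismatch.
            if data != make_block(index, block_size):
                bad.append(index)
    return bad
```

Calling write_pattern() before the power cycle and write_pattern() followed by verify_pattern() afterwards would exercise the same sequence as the mkfs steps in the report, without going through the sg driver.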