On Sun, 2008-01-27 at 01:18 +0100, Mariusz Kozlowski wrote: > Hello, > > > On a big powerpc box I got the following oops with 2.6.24-git2: > > > > sym0: <1010-66> rev 0x1 at pci 0000:d0:01.0 irq 215 > > sym0: No NVRAM, ID 7, Fast-80, LVD, parity checking > > sym0: SCSI BUS has been reset. > > scsi0 : sym-2.2.3 > > target0:0:8: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 31) > > scsi 0:0:8:0: Direct-Access IBM ST318305LC C509 PQ: 0 > > ANSI: 3 > > target0:0:8: tagged command queuing enabled, command queue depth 16. > > target0:0:8: Beginning Domain Validation > > target0:0:8: asynchronous > > target0:0:8: wide asynchronous > > target0:0:8: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31) > > target0:0:8: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31) > > Unable to handle kernel paging request for data at address 0x00000000 > > Faulting instruction address: 0xc000000000038460 > > cpu 0x25: Vector: 300 (Data Access) at [c00000000f567840] > > pc: c000000000038460: .memcpy+0x60/0x280 > > lr: d000000000050280: .sym_set_cam_result_error+0xfc/0x1e0 [sym53c8xx] > > sp: c00000000f567ac0 > > msr: 8000000000009032 > > dar: 0 > > dsisr: 42000000 > > current = 0xc000006d1e0af0a0 > > paca = 0xc0000000004afc00 > > pid = 0, comm = swapper > > enter ? for help > > [link register ] d000000000050280 > > .sym_set_cam_result_error+0xfc/0x1e0 [sym53c8xx] > > [c00000000f567ac0] c00000000f567b80 (unreliable) > > [c00000000f567b80] d0000000000552b8 .sym_complete_error+0x12c/0x1bc [sym53c8xx] > > [c00000000f567c20] d0000000000561a4 .sym_int_sir+0xaa4/0x1718 [sym53c8xx] > > [c00000000f567d00] d000000000057e8c .sym_interrupt+0x4e4/0x6ec [sym53c8xx] > > [c00000000f567dc0] d00000000004fdf4 .sym53c8xx_intr+0x6c/0xdc [sym53c8xx] > > [c00000000f567e50] c0000000000a83e0 .handle_IRQ_event+0x7c/0xec > > [c00000000f567ef0] c0000000000aa344 .handle_fasteoi_irq+0x130/0x1f0 > > [c00000000f567f90] c00000000002a538 .call_handle_irq+0x1c/0x2c > > [c000004d5e0b3a90] c00000000000c320 .do_IRQ+0x108/0x1d0 > > [c000004d5e0b3b20] c000000000004790 hardware_interrupt_entry+0x18/0x1c > > > > The memset() in sym_set_cam_result_error() would appear to be trashing > > the scsi_cmnd struct instead of clearing sense_buffer. > > > > Signed-off-by: Nathan Lynch <ntl@xxxxxxxxx> > > --- > > drivers/scsi/sym53c8xx_2/sym_glue.c | 2 +- > > 1 files changed, 1 insertions(+), 1 deletions(-) > > > > diff --git a/drivers/scsi/sym53c8xx_2/sym_glue.c b/drivers/scsi/sym53c8xx_2/sym_glue.c > > index 21e926d..e939f38 100644 > > --- a/drivers/scsi/sym53c8xx_2/sym_glue.c > > +++ b/drivers/scsi/sym53c8xx_2/sym_glue.c > > @@ -207,7 +207,7 @@ void sym_set_cam_result_error(struct sym_hcb *np, struct sym_ccb *cp, int resid) > > /* > > * Bounce back the sense data to user. > > */ > > - memset(&cmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE); > > + memset(cmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE); > > memcpy(cmd->sense_buffer, cp->sns_bbuf, > > min(SCSI_SENSE_BUFFERSIZE, SYM_SNS_BBUF_LEN)); > > #if 0 > > I was looking into it and found something interesting. What you see was just exposed by: > > commit b80ca4f7ee36c26d300c5a8f429e73372d153379 > Author: FUJITA Tomonori <tomof@xxxxxxx> > Date: Sun Jan 13 15:46:13 2008 +0900 > > [SCSI] replace sizeof sense_buffer with > > This replaces sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE in > several LLDs. It's a preparation for the future changes to remove > sense_buffer array in scsi_cmnd structure. > > Signed-off-by: FUJITA Tomonori <fujita.tomonori@xxxxxxxxxxxxx> > Signed-off-by: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> > > But this is not a simple replace sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE > It has a side effect. Take a look a this portion of the commit: > > --- a/drivers/scsi/sym53c8xx_2/sym_glue.c > +++ b/drivers/scsi/sym53c8xx_2/sym_glue.c > @@ -207,10 +207,9 @@ void sym_set_cam_result_error(struct sym_hcb *np, struct sym_ccb *cp, int resid) > /* > * Bounce back the sense data to user. > */ > - memset(&cmd->sense_buffer, 0, sizeof(cmd->sense_buffer)); > + memset(&cmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE); > memcpy(cmd->sense_buffer, cp->sns_bbuf, > - min(sizeof(cmd->sense_buffer), > - (size_t)SYM_SNS_BBUF_LEN)); > + min(SCSI_SENSE_BUFFERSIZE, SYM_SNS_BBUF_LEN)); > > Before the patch: > 1) memset sets cmd->sense_buffer pointer value to zero (only 4 bytes which is wrong) > 2) memcpy copies min(sizeof(cmd->sense_buffer), (size_t)SYM_SNS_BBUF_LEN)) which > is equal to min(4, 32) so after this only first four bytes were copied from 32 bytes > long cp->sns_bbuf into the sense_buffer This logic is wrong. There was a commit (de25deb18016f66dcdede165d07654559bb332bc) that changed scsi_cmnd.sense_buffer from a char [96] to a char *, which is actually what caused this to break. sizeof(char [96]) is 96 and for some reason in C &x->y == &x->y[0] if char y[something]. > After the patch: > 1) memset zeroes 96 bytes of cmd structure starting at offset where sense_buffer in > cmd is and results in an oops which you fixed > 2) memcpy copies min(SCSI_SENSE_BUFFERSIZE, SYM_SNS_BBUF_LEN)) is equal to > min(96, 32). cp->sns_bbuf is 32 bytes so I guess that's ok but that's different from > the code before b80ca4f7 which was wrong (copied only 4 bytes) but 'worked'. > > Someone more experienced with that code should take a look at the logic of it. James - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html