Re: tgtd segfault with software RAID, hard resetting link

Chris Webb <chris@xxxxxxxxxxxx> writes:

> Guessing that this bug might be the same as the one I'm seeing, I've reproduced
> it here with tgtd running under gdb. In my case, the drive actually vanished
> completely underneath the md because it got so upset! I see a null pointer
> dereference in bs_rdwr_request():
> 
>   Program received signal SIGSEGV, Segmentation fault.
>   [Switching to LWP 6871]
>   bs_rdwr_request (cmd=0x8077ce8) at bs_rdwr.c:98
>   98                              if (((cmd->scb[0] != WRITE_6) && (cmd->scb[1] & 0x8)) ||
>   (gdb) p cmd
>   $1 = (struct scsi_cmd *) 0x8077ce8
>   (gdb) p cmd->scb
>   $2 = (uint8_t *) 0x0
> 
> What's odd is that switch(cmd->scb[0]) back on line 70 didn't fault: cmd->scb
> must have been valid and equal to WRITE_* there, or we wouldn't have reached
> line 98. For what it's worth, length and ret are both 524288 here. I also tried
> a device mapper zero target switched over to an error target, but couldn't
> reproduce the segfault that way.
> 
> This isn't code I'm at all familiar with, so I hesitate to suggest what might
> be going on.

I've now also seen a segfault from a similar null pointer dereference at line
121, in the dprintf, following a read from a hanging md device:

  Program received signal SIGSEGV, Segmentation fault.
  [Switching to LWP 25000]
  0x08054864 in bs_rdwr_request (cmd=0x8076ae8) at bs_rdwr.c:121
  121             dprintf("io done %p %x %d %u\n", cmd, cmd->scb[0], ret, length);

  (gdb) print *cmd
  $6 = {c_target = 0x8070524, c_hlist = {next = 0x8070524, prev = 0x60070}, qlist = {next = 0xa000000, prev = 0x0}, 
    dev_id = 41, dev = 0x0, state = 0, data_dir = DATA_NONE, in_sdb = {resid = 0, length = 0, buffer = 0}, out_sdb = {
      resid = 0, length = 0, buffer = 0}, cmd_itn_id = 0, offset = 0, scb = 0x0, scb_len = 0, 
    lun = "\000\000\000\000\000\000\000", attribute = 0, tag = 0, result = 0, mreq = 0x0, 
    sense_buffer = '\0' <repeats 136 times>, "\022\000\000\000\000\000\000\000A\002\000\000\001Á\000\000\000\000\000\000\000\001\000\000\000\000\000\0005\000\000\020\000\001ð\000\000\001?\216\000\000\000\006(\000\000\211´\b\000\000ø", '\0' <repeats 55 times>, "5\000\000\020\000\000\000", sense_len = 134676660, scsi_cmd_done = 0x80702c8, bs_list = {
      next = 0x8096074, prev = 0x8076c6c}, it_nexus = 0x8076c6c, itn_lu_info = 0x8070268}

This was harder to trigger, though.
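
Both crashes read cmd->scb after the blocking I/O has returned, by which point
it is NULL. Purely as an illustration, here is a minimal sketch in C of copying
the two CDB bytes out before the I/O and only using the copy afterwards. This
is not the tgt code and not a proposed patch: struct demo_cmd and every demo_*
name are made up for the example, and the check only mirrors the part of the
line 98 condition visible in the backtrace. It would avoid the dereference but
says nothing about why scb is being cleared underneath us.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the one field of struct scsi_cmd used here. */
struct demo_cmd {
    uint8_t *scb;               /* CDB pointer; NULL in both backtraces */
};

#define DEMO_WRITE_6 0x0a       /* SCSI WRITE(6) opcode */

/* Copy of the CDB bytes taken while cmd->scb is still known to be valid. */
struct demo_cdb_snapshot {
    uint8_t opcode;             /* scb[0] */
    uint8_t flags;              /* scb[1] */
    int valid;                  /* 0 if scb was already NULL */
};

/* Take the snapshot before starting the blocking read or write. */
static struct demo_cdb_snapshot demo_snapshot(const struct demo_cmd *cmd)
{
    struct demo_cdb_snapshot s = { 0, 0, 0 };

    if (cmd != NULL && cmd->scb != NULL) {
        s.opcode = cmd->scb[0];
        s.flags = cmd->scb[1];
        s.valid = 1;
    }
    return s;
}

/*
 * Same shape as the visible half of the test on line 98, but evaluated on
 * the snapshot, so it never touches cmd->scb after the I/O has completed.
 */
static int demo_wants_sync(const struct demo_cdb_snapshot *s)
{
    return s->valid && s->opcode != DEMO_WRITE_6 && (s->flags & 0x8);
}
```

The same snapshot could feed the dprintf at line 121 instead of cmd->scb[0],
assuming the rest of cmd is still safe to print at that point, which the dump
above makes doubtful.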

Cheers,

Chris.