Re: kernel problems with smart on LSI-92xx

Great, that works. Thank you a lot!

./smartctl /dev/sda -dmegaraid,24 -t long
smartctl 5.41 2010-11-05 r3203 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Extended Background Self Test has begun
Please wait 22 minutes for test to complete.
Estimated completion time: Thu Nov 11 17:44:50 2010

11.11.2010 20:06, Bjørn Mork wrote:
Bokhan Artem<aptem@xxxxxx>  writes:

Hello.

I have kernel problems when trying to run SMART commands on an LSI-92xx
controller.

Running a self-test on a SAS disk (smartctl /dev/sda -dmegaraid,24 -t
long) with smartmontools causes a kernel oops (?) and a segfault. See
the attachment for dmesg.

strace of smartctl:

mknod("/dev/megaraid_sas_ioctl_node", S_IFCHR, makedev(251, 0)) = -1 EEXIST (File exists)
close(4)                                = 0
munmap(0x7f4be3a9f000, 4096)            = 0
open("/dev/megaraid_sas_ioctl_node", O_RDWR) = 4
ioctl(4, MTRRIOC_SET_ENTRY, 0x7fffa574ed30) = 0
ioctl(4, MTRRIOC_SET_ENTRY, 0x7fffa574eb30) = 0
ioctl(4, MTRRIOC_SET_ENTRY<unfinished ...>
+++ killed by SIGSEGV +++


Viewing SMART info works (smartctl /dev/sda -dmegaraid,24 -a).
Running a self-test on a SATA disk on the same system also works.

The problem is reproducible with 2.6.32 and 2.6.36 kernels.

A quick look at this reveals that smartctl will happily do a
MEGASAS_IOC_FIRMWARE ioctl with sge_count = 1 and sgl[0].iov_len = 0 if
it is sending a command with dataLen == 0:


/* Issue passthrough scsi command to PERC5/6 controllers */
bool linux_megaraid_device::megasas_cmd(int cdbLen, void *cdb,
   int dataLen, void *data,
   int /*senseLen*/, void * /*sense*/, int /*report*/)
{
   struct megasas_pthru_frame    *pthru;
   struct megasas_iocpacket      uio;
   int rc;

   memset(&uio, 0, sizeof(uio));
   pthru = (struct megasas_pthru_frame *)uio.frame.raw;
   pthru->cmd = MFI_CMD_PD_SCSI_IO;
   pthru->cmd_status = 0xFF;
   pthru->scsi_status = 0x0;
   pthru->target_id = m_disknum;
   pthru->lun = 0;
   pthru->cdb_len = cdbLen;
   pthru->timeout = 0;
   pthru->flags = MFI_FRAME_DIR_READ;
   pthru->sge_count = 1;
   pthru->data_xfer_len = dataLen;
   pthru->sgl.sge32[0].phys_addr = (intptr_t)data;
   pthru->sgl.sge32[0].length = (uint32_t)dataLen;
   memcpy(pthru->cdb, cdb, cdbLen);

   uio.host_no = m_hba;
   uio.sge_count = 1;
   uio.sgl_off = offsetof(struct megasas_pthru_frame, sgl);
   uio.sgl[0].iov_base = data;
   uio.sgl[0].iov_len = dataLen;

   rc = 0;
   errno = 0;
   rc = ioctl(m_fd, MEGASAS_IOC_FIRMWARE, &uio);
   if (pthru->cmd_status || rc != 0) {
     if (pthru->cmd_status == 12) {
       return set_err(EIO, "megasas_cmd: Device %d does not exist\n", m_disknum);
     }
     return set_err((errno ? errno : EIO), "megasas_cmd result: %d.%d = %d/%d",
                    m_hba, m_disknum, errno,
                    pthru->cmd_status);
   }
   return true;
}
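
One way to avoid handing the driver an empty scatter/gather entry from
user space is to describe a buffer only when there is something to
transfer. The following is a minimal sketch under that assumption, not
the actual smartmontools code and not the attached patch; the sketch_
struct and function names are hypothetical stand-ins for
megasas_iocpacket and its fields:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one scatter/gather entry. */
struct sketch_iovec {
    void   *iov_base;
    size_t  iov_len;
};

/* Hypothetical, simplified stand-in for the ioctl packet. */
struct sketch_packet {
    uint32_t            sge_count;      /* number of valid sgl[] entries */
    uint32_t            data_xfer_len;  /* bytes to transfer */
    struct sketch_iovec sgl[1];
};

/*
 * Fill in the buffer description. A command without a data phase
 * (data_len == 0) gets sge_count == 0, so the receiver never sees a
 * zero-length entry.
 */
void sketch_set_buffer(struct sketch_packet *pkt, void *data, size_t data_len)
{
    assert(pkt != NULL);
    memset(pkt, 0, sizeof(*pkt));

    if (data != NULL && data_len > 0) {
        pkt->sge_count = 1;
        pkt->data_xfer_len = (uint32_t)data_len;
        pkt->sgl[0].iov_base = data;
        pkt->sgl[0].iov_len = data_len;
    }
    /* Otherwise every field stays zero, including sge_count. */
}
```

With this shape, a self-test command (no data buffer) would go out with
sge_count == 0, while a command that returns data still carries exactly
one entry.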



The kernel bug is that the zero valued sgl[0].iov_len is passed
unmodified to megasas_mgmt_fw_ioctl() which again passes it on as size
to dma_alloc_coherent():

         /*
          * For each user buffer, create a mirror buffer and copy in
          */
         for (i = 0; i < ioc->sge_count; i++) {
                 kbuff_arr[i] = dma_alloc_coherent(&instance->pdev->dev,
                                                     ioc->sgl[i].iov_len,
                                                     &buf_handle, GFP_KERNEL);



And it looks like most (all?) of the dma_alloc_coherent()
implementations will use get_order(size) to compute the necessary
allocation. This will fail if size == 0.
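
To make the size == 0 case concrete, here is a rough userspace model of
the generic get_order() loop found in kernels of that era. It is a
sketch from memory, not the kernel's own code: architectures may
override get_order(), and the model_ names and the 4 KiB page size are
assumptions made for illustration. With size == 0 the "size - 1" wraps
around, so the computed order becomes huge instead of 0:

```c
#include <assert.h>
#include <limits.h>

/* Assumption: 4 KiB pages, as on x86_64. */
#define MODEL_PAGE_SHIFT 12

/*
 * Userspace model of the generic get_order() loop: the smallest order
 * such that (page size << order) >= size. Not the kernel's own code.
 */
int model_get_order(unsigned long size)
{
    int order = -1;

    /* For size == 0 this subtraction wraps around to ULONG_MAX. */
    size = (size - 1) >> (MODEL_PAGE_SHIFT - 1);
    do {
        size >>= 1;
        order++;
    } while (size);
    return order;
}

/* Order the model yields for size == 0 on this platform. */
int model_order_for_zero(void)
{
    return (int)(sizeof(unsigned long) * CHAR_BIT) - MODEL_PAGE_SHIFT;
}

/* Sanity checks on ordinary sizes and on the wrapped zero case. */
void model_self_check(void)
{
    assert(model_get_order(1) == 0);        /* fits in one page */
    assert(model_get_order(4096) == 0);     /* exactly one page */
    assert(model_get_order(4097) == 1);     /* needs two pages */
    assert(model_get_order(0) == model_order_for_zero());
}
```

On a build with 64-bit unsigned long the model returns 52 for
size == 0, i.e. a request for 2^52 pages, which no allocator can
satisfy.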

On the other hand, I may have misunderstood this entirely....

But if you dare, you could try the attached patch (compile tested only
as I don't have the hardware) and see if it helps.  Let me know how it
goes, and I'll forward it to the megaraid maintainers if it really fixes
your problem.




Bjørn


--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

