Hi All, Below is the patch details: Problem description: -------------------- The problem reported by one of the customer was when a logical array is deleted(from the SDK, from the GUI, from arcconf) then the corresponding physical device (/dev/sdb, for example) is not removed from the Linux namespace. So you end up with a "dead" device entry. And some of the linux tools go slightly wonky. Solution: --------- Based on the notification from FW, the driver calls "scsi_remove_device" for the DELETED drive. This call not only informs the scsi device status to the SCSI mid layer and also it will remove corresponding scsi device entries from the Linux sysfs. Please see patch below. ===================== Mahesh Linux Driver Development Engineer, Adaptec ODC, Bangalore This attached patch is against current scsi-misc-2.6. ObligatoryDisclaimer: Please accept my condolences regarding Outlook's handling of patch attachments (inline gets damaged, please use attachment). Signed-off-by: Mahesh Rajashekhara <aacraid@xxxxxxxxxxx> drivers/scsi/aacraid/aacraid.h | 4 +++- drivers/scsi/aacraid/commsup.c | 8 ++++++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff -ru a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h --- a/drivers/scsi/aacraid/aacraid.h 2010-05-13 13:11:54.752924936 -0700 +++ b/drivers/scsi/aacraid/aacraid.h 2010-05-13 13:15:41.923389776 -0700 @@ -12,7 +12,7 @@ *----------------------------------------------------------------------------*/ #ifndef AAC_DRIVER_BUILD -# define AAC_DRIVER_BUILD 24702 +# define AAC_DRIVER_BUILD 26400 # define AAC_DRIVER_BRANCH "-ms" #endif #define MAXIMUM_NUM_CONTAINERS 32 @@ -26,6 +26,8 @@ #define AAC_MAX_HOSTPHYSMEMPAGES (0xfffff) #define AAC_MAX_32BIT_SGBCOUNT ((unsigned short)256) +#define AAC_DEBUG_INSTRUMENT_AIF_DELETE + /* * These macros convert from physical channels to virtual channels */ diff -ru a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c --- a/drivers/scsi/aacraid/commsup.c 2010-05-13 13:11:54.752924936 -0700 +++ b/drivers/scsi/aacraid/commsup.c 2010-05-13 13:19:45.767319864 -0700 @@ -1123,6 +1123,9 @@ if (device) { switch (device_config_needed) { case DELETE: +#if (defined(AAC_DEBUG_INSTRUMENT_AIF_DELETE)) + scsi_remove_device(device); +#else if (scsi_device_online(device)) { scsi_device_set_state(device, SDEV_OFFLINE); sdev_printk(KERN_INFO, device, @@ -1131,6 +1134,7 @@ "array deleted" : "enclosure services event"); } +#endif break; case ADD: if (!scsi_device_online(device)) { @@ -1145,12 +1149,16 @@ case CHANGE: if ((channel == CONTAINER_CHANNEL) && (!dev->fsa_dev[container].valid)) { +#if (defined(AAC_DEBUG_INSTRUMENT_AIF_DELETE)) + scsi_remove_device(device); +#else if (!scsi_device_online(device)) break; scsi_device_set_state(device, SDEV_OFFLINE); sdev_printk(KERN_INFO, device, "Device offlined - %s\n", "array failed"); +#endif break; } scsi_rescan_device(&device->sdev_gendev); Regards - Mahesh Rajashekhara -----Original Message----- From: James Bottomley [mailto:James.Bottomley@xxxxxxx] Sent: Sunday, May 02, 2010 12:33 AM To: Rajashekhara, Mahesh Cc: linux-scsi@xxxxxxxxxxxxxxx; AACRAID Subject: Re: AACRAID driver update patch version - 26400 On Wed, 2010-04-07 at 20:55 -0700, Rajashekhara, Mahesh wrote: > (Resending in plain text as the earlier email got bounced) > > Hi All, > > We have prepared driver patch from latest driver source. > > We worked on a issue related to SMARTCTL([Bugme-new] [Bug 15623] New: doesn't allow SMART-passthrough on Adaptec Series 5 controller) > and found the root cause of this problem. We have responded to this bug, stating that we will be releasing the driver patch for addressing this problem. > Also, in our recent program releases we have addressed some more issues. > > Please look into the release notes. Please, no. What I actually need is a set of patches (one per change or feature) with a properly added description and signoff as described in Documentation/SubmittingPatches. > Please find the attachments. > 1. aacraid-version-26400-040510-patch - Driver patch > 2. aacraid-scsi-misc-2.6-warnings.txt - Warning message text > 3. Release notes So if I look at the Changelog it seems that there should be four separate changes: > AACRAID Driver for Linux - version 26400 > ---------------------------------------- > > Addressed issues: > ---------------- > 1. The default driver setting is "expose_physicals=0", which means raw > physical drives not exposed to OS. > If the user wants to expose connected physical drives, enable > "expose_physicals" module parameter. > With the new JBOD firmware, physical drives are not available for > "expose_physicals>0". > In function "aac_expose_phy_device", added to reset the appropriate > bit in the first byte of inquiry data. > This fix exposes the connected physical drives. OK > 2. Added support for handling ATA pass-through commands. > When the "CC" bit is set by the host in ATA pass-through CDB, > driver is supposed to return DID_OK > When the "CC" bit is reset by the host, driver should return > DID_ERROR No, that's not correct. You should return a result code of SAM_STAT_CHECK_CONDITION, with a driver byte of DID_OK. If you return DID_ERROR, it will unconditionally trigger error handling and a retry. > 3. Firmware exposes lesser container capacity than the actual one. > It exposes [actaul size - hidden reserved space(10MB)] to the OS, > IO's to the 10MB should be prohibhited from the Linux > driver. > When the IO's falls into hidden reserved space, driver internally > sets sensekey to HARDWARE_ERROR and sends notification to the SCSI mid > layer. > Added "expose_hidden_space" flag, by default the fix will be > executed. > Only if the user sets "expose_hidden_space=1", user can access > beyond the array reported size(hidden reserved space 10MB). This doesn't exactly look like the right interface. Won't the user want the hidden space exposed on a per array basis? In which case a sysfs flag in the host attributes would be far more appropriate than a module flag? > 4. Based on the notification from firmware, the driver sets up > SDV_OFFLINE flag for a deleted array and > the SCSI mid layer comes to know the attached SCSI device has gone > offline. But with > this notification, scsi device entries are not being removed. > Enabled "AAC_DEBUG_INSTRUMENT_AIF_DELETE" flag in the header file, > with this flag the driver uses "scsi_remove_device". > On the notification from firmware, the driver calls > "scsi_remove_device" for the deleted array. > This call not only informs the scsi device status to the SCSI mid > layer but also > removes corresponding scsi device entry from the linux > sysfs. OK, don't understand this. As noted below, the define AAC_DEBUG_INSTRUMENT_AIF_DELETE is set by your patch, but never appears at all in the driver ... is there missing code? > Thanks & Regards, > Mahesh > Linux Driver Development Engineer, > Adaptec ODC, Bangalore > diff -ru a/drivers/scsi/aacraid/aachba.c > b/drivers/scsi/aacraid/aachba.c > --- a/drivers/scsi/aacraid/aachba.c 2010-04-03 05:49:52.000000000 > -0700 > +++ b/drivers/scsi/aacraid/aachba.c 2010-04-07 12:35:36.468017040 > -0700 > @@ -109,6 +109,16 @@ > #define BYTE2(x) (unsigned char)((x) >> 16) > #define BYTE3(x) (unsigned char)((x) >> 24) > > + > +/* ATA pass thru commands */ > +#ifndef ATA_12 > +#define ATA_12 0xa1 /* 12-byte pass-thru */ > +#endif > + > +#ifndef ATA_16 > +#define ATA_16 0x85 /* 16-byte pass-thru */ > +#endif > + This chunk is clearly wrong ... those defines are in include/scsi/scsi.h > @@ -1647,6 +1671,27 @@ > count = (scsicmd->cmnd[7] << 8) | scsicmd->cmnd[8]; > break; > } > + > + if (expose_hidden_space <= 0) { > + if ((lba + count) > (dev->fsa_dev[scmd_id(scsicmd)].size)) { > + int cid = scmd_id(scsicmd); > + dprintk((KERN_DEBUG "aacraid: Illegal lba\n")); > + scsicmd->result = DID_OK << 16 | COMMAND_COMPLETE << 8 | > + SAM_STAT_CHECK_CONDITION; > + set_sense(&dev->fsa_dev[cid].sense_data, > + HARDWARE_ERROR, > + SENCODE_INTERNAL_TARGET_FAILURE, > + ASENCODE_INTERNAL_TARGET_FAILURE, 0, 0); > + memcpy(scsicmd->sense_buffer, > + &dev->fsa_dev[cid].sense_data, > + min_t(size_t, > + sizeof(dev->fsa_dev[cid].sense_data), > + SCSI_SENSE_BUFFERSIZE)); > + scsicmd->scsi_done(scsicmd); > + return 1; > + } > + } > + This chunk is correctly indented, that's great. > dprintk((KERN_DEBUG "aac_read[cpu %d]: lba = %llu, t = %ld.\n", > smp_processor_id(), (unsigned long long)lba, jiffies)); > if (aac_adapter_bounds(dev,scsicmd,lba)) > @@ -1727,6 +1772,26 @@ > count = (scsicmd->cmnd[7] << 8) | scsicmd->cmnd[8]; > fua = scsicmd->cmnd[1] & 0x8; > } > + > + if (expose_hidden_space <= 0) { > + if ((lba + count) > (dev->fsa_dev[scmd_id(scsicmd)].size)) { > + int cid = scmd_id(scsicmd); > + dprintk((KERN_DEBUG "aacraid: Illegal lba\n")); > + scsicmd->result = DID_OK << 16 | COMMAND_COMPLETE << 8 | > + SAM_STAT_CHECK_CONDITION; > + set_sense(&dev->fsa_dev[cid].sense_data, > + HARDWARE_ERROR, SENCODE_INTERNAL_TARGET_FAILURE, > + ASENCODE_INTERNAL_TARGET_FAILURE, 0, 0); > + memcpy(scsicmd->sense_buffer, > + &dev->fsa_dev[cid].sense_data, > + min_t(size_t, > + sizeof(dev->fsa_dev[cid].sense_data), > + SCSI_SENSE_BUFFERSIZE)); > + > + scsicmd->scsi_done(scsicmd); > + return 1; > + } > + } This one, and numerous others are wrongly indented ... the continuation lines should have an extra tab or be aligned under the opening bracket for a function/macro. Indentation like this makes the file very hard to read. > diff -ru a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h > --- a/drivers/scsi/aacraid/aacraid.h 2010-04-03 05:49:52.000000000 -0700 > +++ b/drivers/scsi/aacraid/aacraid.h 2010-04-07 14:17:18.409380952 -0700 > @@ -1,6 +1,7 @@ > #ifndef dprintk > # define dprintk(x) > #endif > +#define AAC_DEBUG_INSTRUMENT_AIF_DELETE What is this? It's not used anywhere else in the file (a leftover from some removed debugging code?) > @@ -1065,6 +1065,10 @@ > #define aac_adapter_ioremap(dev, size) \ > (dev)->a_ops.adapter_ioremap(dev, size) > > +#define aac_adapter_intr(dev) \ > + ((dev)->a_ops.adapter_intr((dev)->pdev->irq, \ > + (void *)dev)) > + This is another very weird macro, but as far as I can tell, it's never used either. > --- a/drivers/scsi/aacraid/commctrl.c 2010-04-03 05:49:52.000000000 > -0700 > +++ b/drivers/scsi/aacraid/commctrl.c 2010-04-07 12:52:53.803318104 > -0700 > @@ -128,13 +128,13 @@ > retval = > aac_fib_send(le16_to_cpu(kfib->header.Command), fibptr, > le16_to_cpu(kfib->header.Size) , > FsaNormal, > 1, 1, NULL, NULL); > - if (retval) { > - goto cleanup; > - } > if (aac_fib_complete(fibptr) != 0) { > + if (!retval) > retval = -EINVAL; > - goto cleanup; > + goto cleanup; This looks wrong ... it's wrongly indented, but I'm assuming you meant to remove the goto altogether so it gets swept up in the code below? > } > + if (retval) > + goto cleanup; > @@ -534,6 +532,13 @@ > "update mother board > BIOS or consider utilizing one of\n" > "the SAFE mode > kernel options (acpi, apic etc)\n"); > } > + spin_lock_irqsave(&fibptr->event_lock, > + flags); > + fibptr->done = 2; > + spin_unlock_irqrestore( > + &fibptr->event_lock, > + flags); So these locks ... even badly indented ... don't look necessary: fibptr->done is a u32. On all architectures that's updated atomically. This means that the check on fibptr->done always reads either the old value or the new value ... there's no need to add extra locking unless there are other values or states that need updating atomically. > + dprintk((KERN_ERR "aacraid: > sync. command timed out after 180 seconds\n")); James ��.n��������+%������w��{.n�����{������ܨ}���Ơz�j:+v�����w����ޙ��&�)ߡ�a����z�ޗ���ݢj��w�f