Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Tomas,

SCSI command checking in queuecommand function of arcmsr can be removed safely. 
Now driver can pass all scsi command to controller firmware.

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c	2016-10-19 16:18:45.000000000 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c	2016-10-19 16:31:59.665524699 +0800
@@ -2623,13 +2623,6 @@ static int arcmsr_queue_command_lck(stru
 	cmd->scsi_done = done;
 	cmd->host_scribble = NULL;
 	cmd->result = 0;
-	if ((scsicmd == SYNCHRONIZE_CACHE) ||(scsicmd == SEND_DIAGNOSTIC)){
-		if(acb->devstate[target][lun] == ARECA_RAID_GONE) {
-    			cmd->result = (DID_NO_CONNECT << 16);
-		}
-		cmd->scsi_done(cmd);
-		return 0;
-	}
 	if (target == 16) {
 		/* virtual device for iop message transfer */
 		arcmsr_handle_virtual_command(acb, cmd);

Thanks
Ching
On Tue, 2016-10-18 at 15:56 +0200, Tomas Henzl wrote:
> Hi,
> 
> similar suspicious code path can be found in the queuecommand functions in other drivers too
> these are -
> pmcraid.c
> arcmsr_hba.c
> cc-ing maintainers - 
> (but both drivers seem to be unmaintained for a while -
> I've added Ching for arcmsr and Raghava for pmcraid)
> 
> please read this thread and verify that your driver and firmware is safe.
> 
> It is expected that a raid card controls the disk write cache (switches it off)
> but this might not be true for all modes of operation a raid adapter handles
> - pass through/non-RAID etc. In such case when disk write cache is enabled
> and your driver skips the SYNCHRONIZE_CACHE command a FS corruption
> is very much possible during unexpected power loss or even a clean shutdown.
> 
> tomash
> 
> On 17.10.2016 19:51, James Bottomley wrote:
> > On Mon, 2016-10-17 at 23:06 +0530, Kashyap Desai wrote:
> >>> -----Original Message-----
> >>> From: James Bottomley [mailto:jejb@xxxxxxxxxxxxxxxxxx]
> >>> Sent: Monday, October 17, 2016 10:50 PM
> >>> To: Ric Wheeler; Hannes Reinecke; Sumit Saxena; 
> >>> linux-scsi@xxxxxxxxxxxxxxx
> >>> Cc: martin.petersen@xxxxxxxxxx; thenzl@xxxxxxxxxx;
> >>> kashyap.desai@xxxxxxxxxxxx; Christoph Hellwig; Martin K. Petersen;
> >>> Jeff
> >>> Moyer; Gris Ge; Ewan Milne; Jens Axboe
> >>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
> >>> command
> >>> to firmware
> >>>
> >>> On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
> >>>> On 10/17/2016 12:20 PM, James Bottomley wrote:
> >>>>> On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
> >>>>>> On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
> >>>>>>> On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> >>>>>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command
> >>>>>>>> back to
> >>>>>>>> SCSI layer without sending it to firmware as firmware
> >>>>>>>> takes
> >>>>>>>> care of flushing cache.  This patch will change the
> >>>>>>>> driver
> >>>>>>>> behavior wrt SYNCHRONIZE_CACHE handling. If underlying
> >>>>>>>> firmware has support to handle the SYNCHORNIZE_CACHE,
> >>>>>>>> driver
> >>>>>>>> will send it for firmware otherwise complete it back to
> >>>>>>>> SCSI
> >>>>>>>> layer with SUCCESS immediately.  If  Firmware  handle
> >>>>>>>> SYNCHORNIZE_CACHE for both VD and JBOD
> >>>>>>>> "canHandleSyncCache"
> >>>>>>>> bit in scratch pad register(offset
> >>>>>>>> 0x00B4) will be set.
> >>>>>>>>
> >>>>>>>> This behavior can be controlled via module parameter and
> >>>>>>>> user
> >>>>>>>> can fallback to old behavior of returning
> >>>>>>>> SYNCHRONIZE_CACHE by
> >>>>>>>> driver only without sending it to firmware.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Sumit Saxena <sumit.saxena@xxxxxxxxxxxx>
> >>>>>>>> Signed-off-by: Kashyap Desai <kashyap.desai@xxxxxxxxxxxx>
> >>>>>>>> ---
> >>>>>>>>    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
> >>>>>>>>    drivers/scsi/megaraid/megaraid_sas_base.c   | 14
> >>>>>>>> ++++++---
> >>>>>>>> -----
> >>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
> >>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
> >>>>>>>>    4 files changed, 14 insertions(+), 8 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>> b/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>> index ca86c88..43fd14f 100644
> >>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
> >>>>>>>>    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
> >>>>>>>>    #define MR_MAX_MSIX_REG_ARRAY                   16
> >>>>>>>>    #define MR_RDPQ_MODE_OFFSET			0X0
> >>>>>>>> 0800
> >>>>>>>> 000
> >>>>>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0
> >>>>>>>> X010
> >>>>>>>> 0000
> >>>>>>>> 0
> >>>>>>>> +
> >>>>>>>>    /*
> >>>>>>>>    * register set for both 1068 and 1078 controllers
> >>>>>>>>    * structure extended for 1078 registers @@ -2140,6
> >>>>>>>> +2142,7
> >>>>>>>> @@ struct megasas_instance {
> >>>>>>>>    	u8 is_imr;
> >>>>>>>>    	u8 is_rdpq;
> >>>>>>>>    	bool dev_handle;
> >>>>>>>> +	bool fw_sync_cache_support;
> >>>>>>>>    };
> >>>>>>>>    struct MR_LD_VF_MAP {
> >>>>>>>>    	u32 size;
> >>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>> index 092387f..a4a8e2f 100644
> >>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
> >>>>>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;
> >>>>>>>>    module_param(scmd_timeout, int, S_IRUGO);
> >>>>>>>>    MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
> >>>>>>>> (10
> >>>>>>>> -90s), default 90s. See megasas_reset_timer.");
> >>>>>>>>
> >>>>>>>> +bool block_sync_cache;
> >>>>>>>> +module_param(block_sync_cache, bool, S_IRUGO);
> >>>>>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by
> >>>>>>>> driver
> >>>>>>>> Default: 0(Send it to firmware)");
> >>>>>>>> +
> >>>>>>>>    MODULE_LICENSE("GPL");
> >>>>>>>>    MODULE_VERSION(MEGASAS_VERSION);
> >>>>>>>>    MODULE_AUTHOR("megaraidlinux.pdl@xxxxxxxxxxxxx");
> >>>>>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
> >>> Scsi_Host
> >>>>>>>> *shost, struct scsi_cmnd *scmd)
> >>>>>>>>    		goto out_done;
> >>>>>>>>    	}
> >>>>>>>>
> >>>>>>>> -	switch (scmd->cmnd[0]) {
> >>>>>>>> -	case SYNCHRONIZE_CACHE:
> >>>>>>>> -		/*
> >>>>>>>> -		 * FW takes care of flush cache on its
> >>>>>>>> own
> >>>>>>>> -		 * No need to send it down
> >>>>>>>> -		 */
> >>>>>>>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
> >>>>>>>> +		(!instance->fw_sync_cache_support)) {
> >>>>>>>>    		scmd->result = DID_OK << 16;
> >>>>>>>>    		goto out_done;
> >>>>>>>> -	default:
> >>>>>>>> -		break;
> >>>>>>>>    	}
> >>>>>>>>
> >>>>>>>>    	return instance->instancet
> >>>>>>>> ->build_and_issue_cmd(instance, scmd);
> >>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>> index 2159f6a..8237580 100644
> >>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
> >>>>>>>> megasas_instance *instance)
> >>>>>>>>    		ret = 1;
> >>>>>>>>    		goto fail_fw_init;
> >>>>>>>>    	}
> >>>>>>>> +	if (!block_sync_cache)
> >>>>>>>> +		instance->fw_sync_cache_support =
> >>>>>>>> (scratch_pad_2
> >>>>>>>> &
> >>>>>>>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET)
> >>>>>>>> ? 1
> >>>>>>>> :
> >>>>>>>> 0;
> >>>>>>>>
> >>>>>>>>    	IOCInitMessage =
> >>>>>>>>    	  dma_alloc_coherent(&instance->pdev->dev,
> >>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>> index e3bee04..2857154 100644
> >>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>> @@ -72,6 +72,8 @@
> >>>>>>>>    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET
> >>>>>>>>  (0x0000030C)
> >>>>>>>>    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x00
> >>>>>>>> 0000
> >>>>>>>> 6C)
> >>>>>>>>
> >>>>>>>> +extern bool block_sync_cache;
> >>>>>>>> +
> >>>>>>>>    /*
> >>>>>>>>     * Raid context flags
> >>>>>>>>     */
> >>>>>>>>
> >>>>>>> Be extra careful with that.
> >>>>>>>
> >>>>>>> SYNCHRONIZE_CACHE has (potentially) a global scope, as
> >>>>>>> there
> >>>>>>> might be an array-wide cache, and a cache flush would
> >>>>>>> affect the
> >>>>>>> entire cache. Linux is using SYNCHRONIZE_CACHE as a per
> >>>>>>> device
> >>>>>>> setting, ie it assumes that the effects of a cache flush is
> >>>>>>> restricted to the device in question.
> >>>>>>>
> >>>>>>> If this is _not_ the case I'd rather not enable it.
> >>>>>>> Have you checked that enabling this functionality doesn't
> >>>>>>> have
> >>>>>>> negative performance impact?
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>>
> >>>>>>> Hannes
> >>>>>> This must go in - without this fix, there is no data
> >>>>>> integrity for
> >>>>>> any file system.
> >>>>> That's not what I get from the change log.  What it says to me
> >>>>> is
> >>>>> that the caches are currently firmware managed.  Barring
> >>>>> firmware
> >>>>> bugs, that means that we currently don't have any integrity
> >>>>> issues.
> >>>> Your understanding (or the change log) is incorrect.
> >>> There's no current way in English of construing "as firmware takes
> >>> care of
> >>> flushing cache" to mean its inverse, so the changelog needs
> >>> updating to
> >>> explain
> >>> that firmware does *not* take care of cache flushing, so
> >>> effectively
> >>> nothing
> >>> does and we'll need a cc to stable if the problem is that nothing
> >>> takes
> >>> care of
> >>> flushing the drive caches.
> >>>
> >>> James
> >> Sorry for confusion. More accurate to say  -
> >>
> >> Current megaraid_sas driver returns SYNCHRONIZE_CACHE command back to 
> >> SCSI layer without sending it down to firmware as firmware supposed 
> >> to takes care of flushing cache for all Virtual Disk at the time of 
> >> system reboot/shutdown. Because of above FW rule driver coded wrongly 
> >> and it added bug that SYNCHRONIZE_CACHE is not passed to FW for JBOD 
> >> as well.  (MR Firmware wanted OS to manage SYNCHRONIZE_CACHE and pass 
> >> the same to the Firmware. ). We figure out that even for VD, why 
> >> driver should send back SYNC_CACHE command...let's have the same in
> >> FW (giving all control in FW)
> >>
> >> New behavior described -
> >>
> >> IF application requests SYNCH_CACHE
> >>  IF any FW which supports new API bit called canHandleSyncCache
> >>   Driver sends SYNCH_CACHE command to the FW
> >>   IF 'VirtualDisk'
> >>       FW does not send SYNCH_CACHE to drives
> >>       FW returns SUCCESS
> >> ELSE IF 'JBOD'
> >>      FW sends SYNCH_CACHE to drive
> >>      FW obtains status from drive and returns same status back to
> >> driver
> >>   ENDIF
> >>
> >> Fixing this is robust if FW and driver changes are inline. See below 
> >> for more detail.
> >>
> >> Case - 1
> >> This patch is attempt to fix one level problem where Virtual Disks 
> >> are not managed correctly in MR FW. There are some MR FW (E.a Gen2 
> >> and Gen2.5 FW), which set WCE bit for Virtual Disk but FW is not 
> >> reply correct for SYNCH_CACHE.  This was handled in past and carry 
> >> forward (in driver + FW ) to all new generation products as well. We 
> >> tried to collect all possible details why it was handled such way to 
> >> provide better driver fix for this issue(mainly to avoid this FW 
> >> checks and module  parameters etc..). But now it looks like to make 
> >> generic fix (Device ID based etc..)is even risky, so went with this 
> >> protective approach.  It is very risky to find out which Device ID 
> >> and FW has capability to manage SYNC_CACHE, so we managed to figure 
> >> out which are the new FW using FW capability bit.
> >>
> >> E.g Liberator/Thunderbolt MR FW SYNC_CACHE is write Unsupported 
> >> command (this is a firmware bug) and immediately it will be failed to 
> >> SML. This will cause File system to go Read-only.
> > That's a better description ... what you're saying is that the patch
> > doesn't actually fix the bug Ric is worrying about.
> >
> >> Case -2
> >> One more thing which we are trying and known driver can send 
> >> "SYNC_CACHE" unconditionally to all generation of FW.  For this we 
> >> are doing more unit testing on various controllers/FW and planning to 
> >> provide another patch to fix JBOD path for any FW. (It will not be 
> >> based on FW capability/module parameter).
> > OK, but we really need this ASAP.  As Ric said, data integrity is at
> > risk until this is fixed.  Can you also reference the commit that
> > introduced the problem so we know how far to do the sable backports?
> >
> >> If we remove module parameter, we can ask customer to disable WCE on
> >> drive to get similar impact.
> > Thanks,
> >
> > James
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux