>-----Original Message----- >From: Ric Wheeler [mailto:rwheeler@xxxxxxxxxx] >Sent: Monday, October 17, 2016 6:31 PM >To: Hannes Reinecke; Sumit Saxena; linux-scsi@xxxxxxxxxxxxxxx >Cc: martin.petersen@xxxxxxxxxx; thenzl@xxxxxxxxxx; jejb@xxxxxxxxxxxxxxxxxx; >kashyap.desai@xxxxxxxxxxxx; Christoph Hellwig; Martin K. Petersen; Jeff >Moyer; Gris Ge; Ewan Milne; Jens Axboe >Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to >firmware > >On 10/17/2016 07:34 AM, Hannes Reinecke wrote: >> On 10/17/2016 12:24 PM, Sumit Saxena wrote: >>> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI >>> layer without sending it to firmware as firmware takes care of flushing cache. >>> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE handling. >>> If underlying firmware has support to handle the SYNCHORNIZE_CACHE, >>> driver will send it for firmware otherwise complete it back to SCSI >>> layer with SUCCESS immediately. >>> If Firmware handle SYNCHORNIZE_CACHE for both VD and JBOD >>> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will >>> be set. >>> >>> This behavior can be controlled via module parameter and user can >>> fallback to old behavior of returning SYNCHRONIZE_CACHE by driver >>> only without sending it to firmware. >>> >>> Signed-off-by: Sumit Saxena <sumit.saxena@xxxxxxxxxxxx> >>> Signed-off-by: Kashyap Desai <kashyap.desai@xxxxxxxxxxxx> >>> --- >>> drivers/scsi/megaraid/megaraid_sas.h | 3 +++ >>> drivers/scsi/megaraid/megaraid_sas_base.c | 14 ++++++-------- >>> drivers/scsi/megaraid/megaraid_sas_fusion.c | 3 +++ >>> drivers/scsi/megaraid/megaraid_sas_fusion.h | 2 ++ >>> 4 files changed, 14 insertions(+), 8 deletions(-) >>> >>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h >>> b/drivers/scsi/megaraid/megaraid_sas.h >>> index ca86c88..43fd14f 100644 >>> --- a/drivers/scsi/megaraid/megaraid_sas.h >>> +++ b/drivers/scsi/megaraid/megaraid_sas.h >>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT { >>> #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT 14 >>> #define MR_MAX_MSIX_REG_ARRAY 16 >>> #define MR_RDPQ_MODE_OFFSET 0X00800000 >>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET 0X01000000 >>> + >>> /* >>> * register set for both 1068 and 1078 controllers >>> * structure extended for 1078 registers @@ -2140,6 +2142,7 @@ >>> struct megasas_instance { >>> u8 is_imr; >>> u8 is_rdpq; >>> bool dev_handle; >>> + bool fw_sync_cache_support; >>> }; >>> struct MR_LD_VF_MAP { >>> u32 size; >>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c >>> b/drivers/scsi/megaraid/megaraid_sas_base.c >>> index 092387f..a4a8e2f 100644 >>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c >>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c >>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout = >MEGASAS_DEFAULT_CMD_TIMEOUT; >>> module_param(scmd_timeout, int, S_IRUGO); >>> MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s), >>> default 90s. See megasas_reset_timer."); >>> >>> +bool block_sync_cache; >>> +module_param(block_sync_cache, bool, S_IRUGO); >>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver >>> +Default: 0(Send it to firmware)"); >>> + >>> MODULE_LICENSE("GPL"); >>> MODULE_VERSION(MEGASAS_VERSION); >>> MODULE_AUTHOR("megaraidlinux.pdl@xxxxxxxxxxxxx"); >>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host >*shost, struct scsi_cmnd *scmd) >>> goto out_done; >>> } >>> >>> - switch (scmd->cmnd[0]) { >>> - case SYNCHRONIZE_CACHE: >>> - /* >>> - * FW takes care of flush cache on its own >>> - * No need to send it down >>> - */ >>> + if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) && >>> + (!instance->fw_sync_cache_support)) { >>> scmd->result = DID_OK << 16; >>> goto out_done; >>> - default: >>> - break; >>> } >>> >>> return instance->instancet->build_and_issue_cmd(instance, scmd); >>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c >>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c >>> index 2159f6a..8237580 100644 >>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c >>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c >>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct megasas_instance >*instance) >>> ret = 1; >>> goto fail_fw_init; >>> } >>> + if (!block_sync_cache) >>> + instance->fw_sync_cache_support = (scratch_pad_2 & >>> + MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0; >>> >>> IOCInitMessage = >>> dma_alloc_coherent(&instance->pdev->dev, >>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h >>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h >>> index e3bee04..2857154 100644 >>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h >>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h >>> @@ -72,6 +72,8 @@ >>> #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET (0x0000030C) >>> #define MPI2_REPLY_POST_HOST_INDEX_OFFSET (0x0000006C) >>> >>> +extern bool block_sync_cache; >>> + >>> /* >>> * Raid context flags >>> */ >>> >> Be extra careful with that. >> >> SYNCHRONIZE_CACHE has (potentially) a global scope, as there might be >> an array-wide cache, and a cache flush would affect the entire cache. >> Linux is using SYNCHRONIZE_CACHE as a per device setting, ie it >> assumes that the effects of a cache flush is restricted to the device in question. >> >> If this is _not_ the case I'd rather not enable it. >> Have you checked that enabling this functionality doesn't have >> negative performance impact? >> >> Cheers, >> >> Hannes > >This must go in - without this fix, there is no data integrity for any file system. > >In effect, this driver by default has been throwing away SYNCHRONIZE_CACHE >commands even when acting in JBOD/non-RAID mode. For JBOD mode, we are planning to send SYNCHRONIZE_CACHE unconditionally for all generation Controllers and FW. Since there is single driver for all generation controllers so we are doing some Testing that opening SYNCHRONIZE_CACHE should not break any controller. I will be sending follow up patch for the same as soon as I am done. > >Of course, actually doing a SYNCHRONIZE_CACHE to drives will be slower than >throwing it away, but this is not optional. > >We really need to have some ways to validate that our IO stack is properly and >safely configured. > >I would love to see a couple of things: > >* having T10 & T13 report the existence of a volatile write cache - this is different >than WCE set, some devices have a write cache and are battery/flash backed. > >* having a robust tool to test over power failure/disconnect that our assumptions >are true - any write is durable after a SYNCHRONIZE_CACHE or >CACHE_FLUSH_EXT is sent and ack'ed > >Regards, > >Ric > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html