-----Original Message----- From: Martin Wilck [mailto:mwilck@xxxxxxxx] Sent: Thursday, January 7, 2021 6:14 PM To: Don Brace - C33706 <Don.Brace@xxxxxxxxxxxxx>; Kevin Barnett - C33748 <Kevin.Barnett@xxxxxxxxxxxxx>; Scott Teel - C33730 <Scott.Teel@xxxxxxxxxxxxx>; Justin Lindley - C33718 <Justin.Lindley@xxxxxxxxxxxxx>; Scott Benesh - C33703 <Scott.Benesh@xxxxxxxxxxxxx>; Gerry Morong - C33720 <Gerry.Morong@xxxxxxxxxxxxx>; Mahesh Rajashekhara - I30583 <Mahesh.Rajashekhara@xxxxxxxxxxxxx>; hch@xxxxxxxxxxxxx; jejb@xxxxxxxxxxxxxxxxxx; joseph.szczypek@xxxxxxx; POSWALD@xxxxxxxx Cc: linux-scsi@xxxxxxxxxxxxxxx Subject: Re: [PATCH V3 10/25] smartpqi: add stream detection EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote: > * Enhance performance by adding sequential stream detection. > for R5/R6 sequential write requests. > * Reduce stripe lock contention with full-stripe write > operations. I suppose that "stripe lock" is used by the firmware? Could you elaborate a bit more how this technique improves performance? > + /* > + * If controller does not support AIO RAID{5,6} writes, need > to send > + * requests down non-AIO path. > + */ > + if ((device->raid_level == SA_RAID_5 && !ctrl_info- > >enable_r5_writes) || > + (device->raid_level == SA_RAID_6 && !ctrl_info- > >enable_r6_writes)) > + return true; > + > + lru_index = 0; > + oldest_jiffies = INT_MAX; > + for (i = 0; i < NUM_STREAMS_PER_LUN; i++) { > + pqi_stream_data = &device->stream_data[i]; > + /* > + * Check for adjacent request or request is within > + * the previous request. > + */ > + if ((pqi_stream_data->next_lba && > + rmd.first_block >= pqi_stream_data->next_lba) > && > + rmd.first_block <= pqi_stream_data->next_lba > + > + rmd.block_cnt) { Here you seem to assume that the previous write had the same block_cnt. What's the justification for that? Don: There is a maximum request size for RAID 5/RAID 6 write requests. So we are assuming that if a sequential stream is detected, the stream is comprised of similar request sizes. In fact for coalescing, the LBAs need to be continuous or nearly contiguous, otherwise the RAID engine will not wait for a full-stripe. I have updated the patch description accordingly. > + pqi_stream_data->next_lba = rmd.first_block + > + rmd.block_cnt; > + pqi_stream_data->last_accessed = jiffies; > + return true; > + } > + > + /* unused entry */ > + if (pqi_stream_data->last_accessed == 0) { > + lru_index = i; > + break; > + } > + > + /* Find entry with oldest last accessed time. */ > + if (pqi_stream_data->last_accessed <= oldest_jiffies) > { > + oldest_jiffies = pqi_stream_data- > >last_accessed; > + lru_index = i; > + } > + } > + > + /* Set LRU entry. */ > + pqi_stream_data = &device->stream_data[lru_index]; > + pqi_stream_data->last_accessed = jiffies; > + pqi_stream_data->next_lba = rmd.first_block + rmd.block_cnt; > + > + return false; > +} > + > +static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct > scsi_cmnd *scmd) > { > int rc; > struct pqi_ctrl_info *ctrl_info; @@ -5768,11 +5842,12 @@ > static int pqi_scsi_queue_command(struct Scsi_Host *shost, > raid_bypassed = false; > if (device->raid_bypass_enabled && > !blk_rq_is_passthrough(scmd->request)) { > - rc = > pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, > - scmd, queue_group); > - if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY) > { > - raid_bypassed = true; > - atomic_inc(&device->raid_bypass_cnt); > + if (!pqi_is_parity_write_stream(ctrl_info, > scmd)) { > + rc = > pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, scmd, queue_group); > + if (rc == 0 || rc == > SCSI_MLQUEUE_HOST_BUSY) { > + raid_bypassed = true; > + atomic_inc(&device- > >raid_bypass_cnt); > + } > } > } > if (!raid_bypassed) >