On Fri, 2016-02-26 at 12:33 -0800, Sheng Yang wrote:
> On Fri, Feb 26, 2016 at 12:00 PM, Andy Grover <agrover@xxxxxxxxxx> wrote:
> > On 02/26/2016 11:43 AM, Sheng Yang wrote:
> >>>
> >>> I thought we could allocate an exactly 1MB buffer, with your
> >>> changes, no? That warning message may need to be changed.
> >>
> >>
> >> Yes we can, it makes sense to remove the warning message.
> >>
> >> Though I am not quite comfortable with one request filling the whole
> >> buffer, because we would only be able to handle one request with the
> >> whole ring.
> >>
> >> Like we discussed in
> >> http://comments.gmane.org/gmane.linux.scsi.target.devel/11107 , we
> >> should have a ring buffer that fits all requests from the upper layer.
> >
> >
> > I still don't have a good answer for what we should do here. If we
> > want to accommodate 1MB requests, and we want to have a reasonable
> > queue depth (64? 128?) then we would be vmalloc()ing more than 64MB
> > for each TCMU device. That seems too wasteful to me.
>
> I think in that case we don't want to handle 1MB requests. We can
> limit the max sectors one command can handle to e.g. 128 sectors/64KB;
> with 128 commands in the queue, that's 8MB for the data ring, which
> sounds pretty reasonable.
>
> The problem is I don't know how to limit it in TCMU. I can only find a
> way to do it in the tcm loopback device.
>

So a backend driver declares its supported max_sectors per I/O using
dev->dev_attrib.hw_max_sectors, which is reported back to the initiator
via Block Limits (VPD=0xb0) in MAXIMUM TRANSFER LENGTH.

However, not all initiator hosts honor MAXIMUM TRANSFER LENGTH, and
these hosts will continue to send I/Os larger than hw_max_sectors.

The work-around we settled on last year to address this (at the fabric
driver level) for qla2xxx was to include a new TFO->max_data_sg_nents,
which declares the HW fabric limit for maximum I/O size here:

target/qla2xxx: Honor max_data_sg_nents I/O transfer limit
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8f9b565

When an I/O is received that is larger than the fabric HW limit, the
residual_count is set for the payload that could not be transferred, so
the host will reissue the remainder upon completion in a new I/O.
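To make the residual_count mechanics concrete, here is a minimal
standalone sketch of the truncation step (sketch_cmd, sketch_apply_mtl
and the flag value are simplified stand-ins for illustration, not the
real se_cmd code):

#include <stdint.h>

#define SCF_UNDERFLOW_BIT	0x1	/* simplified stand-in flag value */

struct sketch_cmd {
	uint32_t data_length;		/* transfer length the host asked for */
	uint32_t residual_count;	/* bytes that will not be transferred */
	uint32_t flags;
};

/*
 * Truncate an oversized command to the maximum transfer length (mtl)
 * and record the remainder as an underflow residual, so the host
 * reissues it as a new I/O upon completion.
 */
static void sketch_apply_mtl(struct sketch_cmd *cmd, uint32_t mtl)
{
	if (cmd->data_length <= mtl)
		return;				/* already fits within the limit */

	cmd->flags |= SCF_UNDERFLOW_BIT;
	cmd->residual_count = cmd->data_length - mtl;
	cmd->data_length = mtl;			/* transfer only what fits */
}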
The same logic for generating a residual_count for I/Os exceeding a
particular backend's hw_max_sectors could easily be added in
target_core_transport.c:target_check_max_data_sg_nents(). Something
akin to:

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 6348ae0..7c929c0 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1143,17 +1143,17 @@ static sense_reason_t
 target_check_max_data_sg_nents(struct se_cmd *cmd, struct se_device *dev,
			        unsigned int size)
 {
-	u32 mtl;
-
-	if (!cmd->se_tfo->max_data_sg_nents)
-		return TCM_NO_SENSE;
+	u32 mtl, fabric_mtl, backend_mtl;
 	/*
 	 * Check if fabric enforced maximum SGL entries per I/O descriptor
 	 * exceeds se_cmd->data_length.  If true, set SCF_UNDERFLOW_BIT +
 	 * residual_count and reduce original cmd->data_length to maximum
 	 * length based on single PAGE_SIZE entry scatter-lists.
 	 */
-	mtl = (cmd->se_tfo->max_data_sg_nents * PAGE_SIZE);
+	fabric_mtl = (cmd->se_tfo->max_data_sg_nents * PAGE_SIZE);
+	backend_mtl = (dev->dev_attrib.hw_max_sectors * dev->dev_attrib.block_size);
+	mtl = min_not_zero(fabric_mtl, backend_mtl);
+
 	if (cmd->data_length > mtl) {
 		/*
 		 * If an existing CDB overflow is present, calculate new residual
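As an aside, min_not_zero() in the hunk above is the stock kernel
helper that returns the smaller of its two arguments while treating
zero as "unset", so a fabric that never declares max_data_sg_nents
does not clamp mtl to zero. A quick worked example with made-up
numbers, assuming a loopback fabric with no max_data_sg_nents limit
and a TCMU backend advertising hw_max_sectors=128 with 512-byte
blocks:

	fabric_mtl  = 0 * PAGE_SIZE;                 /* 0: fabric limit unset */
	backend_mtl = 128 * 512;                     /* 64KB backend limit */
	mtl = min_not_zero(fabric_mtl, backend_mtl); /* -> 65536 (64KB) */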