On Fri, 2016-02-26 at 12:33 -0800, Sheng Yang wrote:
> On Fri, Feb 26, 2016 at 12:00 PM, Andy Grover <agrover@xxxxxxxxxx> wrote:
> > On 02/26/2016 11:43 AM, Sheng Yang wrote:
> >>>
> >>> I thought we could allocate an exactly 1MB buffer, with your
> >>> changes, no? That warning message may need to be changed.
> >>
> >>
> >> Yes we can, it makes sense to remove the warning message.
> >>
> >> Though I am not quite comfortable with one request filling the whole
> >> buffer, because we would only be able to handle one request with the
> >> whole ring.
> >>
> >> Like we discussed in
> >> http://comments.gmane.org/gmane.linux.scsi.target.devel/11107 , we
> >> should have a ring buffer that fits all requests from the upper layer.
> >
> >
> > I still don't have a good answer for what we should do here. If we
> > want to accommodate 1MB requests, and we want to have a reasonable
> > queue depth (64? 128?) then we would be vmalloc()ing more than 64MB
> > for each TCMU device. That seems too wasteful to me.
>
> I think in that case we don't want to handle 1MB requests. We can
> limit the max sectors one command can handle to e.g. 128 sectors/64KB;
> with 128 commands in the queue, that's 8MB for the data ring, which
> sounds pretty reasonable.
>
> The problem is I don't know how to limit it in TCMU. I can only find a
> way to do it in the tcm loopback device.
>

So a backend driver declares its supported max_sectors per I/O using
dev->dev_attrib.hw_max_sectors, which is reported back to the initiator
via Block Limits (VPD=0xb0) in MAXIMUM TRANSFER LENGTH.

However, not all initiator hosts honor MAXIMUM TRANSFER LENGTH, and
these hosts will continue to send I/Os larger than hw_max_sectors.

The work-around we settled on last year to address this (at the fabric
driver level) for qla2xxx was to include a new TFO->max_data_sg_nents,
which declares the HW fabric limit for maximum I/O size here:

target/qla2xxx: Honor max_data_sg_nents I/O transfer limit
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8f9b565

When an I/O is received that is larger than the fabric HW limit, the
residual_count is set for the payload that could not be transferred, so
the host will reissue the remainder upon completion in a new I/O.
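To make the residual_count mechanics concrete, here is a minimal
standalone sketch of the truncation step (sketch_cmd, sketch_apply_mtl
and the flag value are simplified stand-ins for illustration, not the
real se_cmd code):

#include <stdint.h>

#define SCF_UNDERFLOW_BIT	0x1	/* simplified stand-in flag value */

struct sketch_cmd {
	uint32_t data_length;		/* transfer length the host asked for */
	uint32_t residual_count;	/* bytes that will not be transferred */
	uint32_t flags;
};

/*
 * Truncate an oversized command to the maximum transfer length (mtl)
 * and record the remainder as an underflow residual, so the host
 * reissues it as a new I/O upon completion.
 */
static void sketch_apply_mtl(struct sketch_cmd *cmd, uint32_t mtl)
{
	if (cmd->data_length <= mtl)
		return;				/* already fits within the limit */

	cmd->flags |= SCF_UNDERFLOW_BIT;
	cmd->residual_count = cmd->data_length - mtl;
	cmd->data_length = mtl;			/* transfer only what fits */
}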
The same logic for generating a residual_count for I/Os exceeding a
particular backend's hw_max_sectors could easily be added in
target_core_transport.c:target_check_max_data_sg_nents(). Something
akin to:

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 6348ae0..7c929c0 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1143,17 +1143,17 @@ static sense_reason_t
 target_check_max_data_sg_nents(struct se_cmd *cmd, struct se_device *dev,
			        unsigned int size)
 {
-	u32 mtl;
-
-	if (!cmd->se_tfo->max_data_sg_nents)
-		return TCM_NO_SENSE;
+	u32 mtl, fabric_mtl, backend_mtl;
 	/*
 	 * Check if fabric enforced maximum SGL entries per I/O descriptor
 	 * exceeds se_cmd->data_length.  If true, set SCF_UNDERFLOW_BIT +
 	 * residual_count and reduce original cmd->data_length to maximum
 	 * length based on single PAGE_SIZE entry scatter-lists.
 	 */
-	mtl = (cmd->se_tfo->max_data_sg_nents * PAGE_SIZE);
+	fabric_mtl = (cmd->se_tfo->max_data_sg_nents * PAGE_SIZE);
+	backend_mtl = (dev->dev_attrib.hw_max_sectors * dev->dev_attrib.block_size);
+	mtl = min_not_zero(fabric_mtl, backend_mtl);
+
 	if (cmd->data_length > mtl) {
 		/*
 		 * If an existing CDB overflow is present, calculate new residual
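As an aside, min_not_zero() in the hunk above is the stock kernel
helper that returns the smaller of its two arguments while treating
zero as "unset", so a fabric that never declares max_data_sg_nents
does not clamp mtl to zero. A quick worked example with made-up
numbers, assuming a loopback fabric with no max_data_sg_nents limit
and a TCMU backend advertising hw_max_sectors=128 with 512-byte
blocks:

	fabric_mtl  = 0 * PAGE_SIZE;                 /* 0: fabric limit unset */
	backend_mtl = 128 * 512;                     /* 64KB backend limit */
	mtl = min_not_zero(fabric_mtl, backend_mtl); /* -> 65536 (64KB) */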