Re: TCMU buffer sizing (was Re: [PATCH 3/4] target/user: Introduce data_bitmap, replace data_length/data_head/data_tail)

On Sat, Feb 27, 2016 at 4:16 PM, Nicholas A. Bellinger
<nab@xxxxxxxxxxxxxxx> wrote:
> On Fri, 2016-02-26 at 12:33 -0800, Sheng Yang wrote:
>> On Fri, Feb 26, 2016 at 12:00 PM, Andy Grover <agrover@xxxxxxxxxx> wrote:
>> > On 02/26/2016 11:43 AM, Sheng Yang wrote:
>> >>>
>> >>> I thought we could allocate exactly a 1MB buffer with your changes, no?
>> >>> That warning message may need to be changed.
>> >>
>> >>
>> Yes we can; it makes sense to remove the warning message.
>> >>
>> Though I am not quite comfortable with one request filling the whole
>> buffer, because we would then only be able to handle a single request
>> in the entire ring.
>> >>
>> As we discussed in
>> http://comments.gmane.org/gmane.linux.scsi.target.devel/11107 , we
>> should have a ring buffer that fits all requests from the upper layer.
>> >
>> >
>> > I still don't have a good answer for what we should do here. If we want to
>> > accommodate 1MB requests, and we want to have a reasonable queue depth (64?
>> > 128?) then we would be vmalloc()ing more than 64MB for each TCMU device.
>> > That seems too wasteful to me.
>>
>> I think in that case we don't want to handle 1MB requests. We can limit
>> the max sectors one command can handle to e.g. 128 sectors (64KB); with
>> 128 commands in the queue, that's an 8MB data ring, which sounds pretty
>> reasonable.
>>
>> The problem is I don't know how to limit it on TCMU. I can only find a
>> way to do it in the tcm loopback device.
>>
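To put the sizing arithmetic above into code, a rough sketch (the constant
names are illustrative only, not actual TCMU defines):

#include <stddef.h>

/* Illustrative constants only -- not actual TCMU defines. */
#define EXAMPLE_MAX_SECTORS_PER_CMD  128   /* per-command cap under discussion */
#define EXAMPLE_SECTOR_SIZE          512   /* bytes */
#define EXAMPLE_QUEUE_DEPTH          128   /* commands outstanding in the ring */

/*
 * 128 sectors * 512 bytes = 64KB per command;
 * 64KB * 128 commands     = 8MB of data ring per device,
 * versus the 64MB+ figure above if every queued command may be a full 1MB.
 */
static const size_t example_data_ring_size =
        (size_t)EXAMPLE_MAX_SECTORS_PER_CMD * EXAMPLE_SECTOR_SIZE * EXAMPLE_QUEUE_DEPTH;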
>
> So a backend driver declares its supported max_sectors per I/O using
> dev->dev_attrib.hw_max_sectors, which is reported back to the initiator
> as MAXIMUM TRANSFER LENGTH in the Block Limits VPD page (0xb0).
>
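As a minimal sketch of where a backend such as TCMU could declare that limit,
assuming the usual configure_device callback (the 128-sector / 512-byte values
are just the example from the discussion above, not actual TCMU defaults):

#include <target/target_core_base.h>

/*
 * Sketch only: a backend capping per-command transfer size.
 * 128 sectors * 512-byte hw_block_size = 64KB per I/O, which the core
 * then reports to initiators as MAXIMUM TRANSFER LENGTH in the Block
 * Limits VPD page (0xb0).
 */
static int example_configure_device(struct se_device *dev)
{
        dev->dev_attrib.hw_block_size  = 512;
        dev->dev_attrib.hw_max_sectors = 128;
        dev->dev_attrib.hw_queue_depth = 128;

        /* ... remainder of backend-specific setup ... */
        return 0;
}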
> However, not all initiator hosts honor MAXIMUM TRANSFER LENGTH, and
> these hosts will continue to send I/Os larger than hw_max_sectors.
>
> The work-around we settled on last year to address this (at the fabric
> driver level) for qla2xxx was to include a new TFO->max_data_sg_nents,
> which declares the fabric HW limit on maximum I/O size here:
>
> target/qla2xxx: Honor max_data_sg_nents I/O transfer limit
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8f9b565
>
> When an I/O is received that is larger than the fabric HW limit, the
> residual_count is set for the payload that could not be transferred, so
> the host will reissue the remainder as a new I/O upon completion.
>
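In code, that residual handling amounts to roughly the following simplified
sketch (it ignores the pre-existing overflow/underflow cases the real
target_check_max_data_sg_nents() also has to handle):

#include <target/target_core_base.h>

/*
 * Simplified: if the command exceeds the advertised maximum transfer
 * length, truncate it and record the untransferred remainder as a
 * residual, so the initiator reissues that portion as a new I/O.
 */
static void example_clamp_to_mtl(struct se_cmd *cmd, u32 mtl)
{
        if (cmd->data_length <= mtl)
                return;

        cmd->se_cmd_flags |= SCF_UNDERFLOW_BIT;
        cmd->residual_count = cmd->data_length - mtl;
        cmd->data_length = mtl;
}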
> The same logic for generating a residual_count for I/Os exceeding a
> particular backend's hw_max_sectors could easily be added in
> target_core_transport.c:target_check_max_data_sg_nents().
>
> Something akin to:
>
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 6348ae0..7c929c0 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -1143,17 +1143,17 @@ static sense_reason_t
>  target_check_max_data_sg_nents(struct se_cmd *cmd, struct se_device *dev,
>                                unsigned int size)
>  {
> -       u32 mtl;
> -
> -       if (!cmd->se_tfo->max_data_sg_nents)
> -               return TCM_NO_SENSE;
> +       u32 mtl, fabric_mtl, backend_mtl;
>         /*
>          * Check if fabric enforced maximum SGL entries per I/O descriptor
>          * exceeds se_cmd->data_length.  If true, set SCF_UNDERFLOW_BIT +
>          * residual_count and reduce original cmd->data_length to maximum
>          * length based on single PAGE_SIZE entry scatter-lists.
>          */
> -       mtl = (cmd->se_tfo->max_data_sg_nents * PAGE_SIZE);
> +       fabric_mtl = (cmd->se_tfo->max_data_sg_nents * PAGE_SIZE);
> +       backend_mtl = (dev->dev_attrib.hw_max_sectors * dev->dev_attrib.block_size);
> +       mtl = min_not_zero(fabric_mtl, backend_mtl);
> +
>         if (cmd->data_length > mtl) {
>                 /*
>                  * If an existing CDB overflow is present, calculate new residual
>
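One property of min_not_zero() worth noting here: if either side declares no
limit (a value of 0), the other limit still applies rather than the result
collapsing to zero. With made-up numbers:

#include <linux/kernel.h>

/* Illustration only, with made-up numbers. */
static u32 example_combined_mtl(void)
{
        u32 fabric_mtl  = 1200 * 4096; /* e.g. TFO->max_data_sg_nents * PAGE_SIZE */
        u32 backend_mtl = 128 * 512;   /* hw_max_sectors * block_size = 64KB */

        /*
         * Picks the stricter non-zero limit; if the fabric declares no
         * limit (max_data_sg_nents == 0, so fabric_mtl == 0), the
         * backend limit still applies.
         */
        return min_not_zero(fabric_mtl, backend_mtl);
}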
Thank you for the explanation!

This is definitely the way to go for the max_sectors part.

However, the queue depth part still confuses me. I am not sure whether
dev->hw_queue_depth is honored.

--Sheng


