On 01/09/2015 07:00 PM, Michael Christie wrote:
>
> On Jan 8, 2015, at 11:03 PM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
>
>> On Thu, 2015-01-08 at 15:22 -0800, James Bottomley wrote:
>>> On Thu, 2015-01-08 at 14:57 -0800, Nicholas A. Bellinger wrote:
>>>> On Thu, 2015-01-08 at 14:29 -0800, James Bottomley wrote:
>>>>> On Thu, 2015-01-08 at 14:16 -0800, Nicholas A. Bellinger wrote:
>>
>> <SNIP>
>>
>>>> The point is that a simple session wide counter for command sequence
>>>> number assignment is significantly less overhead than all of the
>>>> overhead associated with running a full multipath stack atop multiple
>>>> sessions.
>>>
>>> I don't see how that's relevant to issue speed, which was the measure we
>>> were using: the layers above are just a hopper. As long as they're
>>> loaded, the MQ lower layer can issue at full speed. So as long as the
>>> multipath hopper is efficient enough to keep the queues loaded there's
>>> no speed degradation.
>>>
>>> The problem with a sequence point inside the MQ issue layer is that it
>>> can cause a stall that reduces the issue speed. So the counter sequence
>>> point causes a degraded issue speed over the multipath hopper approach
>>> above even if the multipath approach has a higher CPU overhead.
>>>
>>> Now, if the system is close to 100% CPU already, *then* the multipath
>>> overhead will try to take CPU power we don't have and cause a stall, but
>>> that's only in the flat-out CPU case.
>>>
>>>> Not to mention that our iSCSI/iSER initiator is already taking a session
>>>> wide lock when sending outgoing PDUs, so adding a session wide counter
>>>> isn't adding any additional synchronization overhead vs. what's already
>>>> in place.
>>>
>>> I'll leave it up to the iSER people to decide whether they're redoing
>>> this as part of the MQ work.
>>>
>>
>> Session wide command sequence number synchronization isn't something to
>> be removed as part of the MQ work. It's an iSCSI/iSER protocol
>> requirement.
>>
>> That is, the expected + maximum sequence numbers are returned as part of
>> every response PDU, which the initiator uses to determine when the
>> command sequence number window is open so new non-immediate commands may
>> be sent to the target.
>>
>> So, given some manner of session wide synchronization is required
>> between different contexts for the existing single connection case to
>> update the command sequence number and check when the window opens, it's
>> a fallacy to claim MC/S adds some type of new initiator specific
>> synchronization overhead vs. single connection code.
>
> I think you are assuming we are leaving the iscsi code as it is today.
>
> For the non-MCS mq session per CPU design, we would be allocating and
> binding the session and its resources to specific CPUs. They would only
> be accessed by the threads on that one CPU, so we get our
> serialization/synchronization from that. That is why we are saying we
> do not need something like atomic_t/spin_locks for the sequence number
> handling for this type of implementation.
>
Wouldn't that need to be coordinated with the networking layer?
Doesn't it do the same thing, matching TX/RX queues to CPUs?
If so, wouldn't we decrease bandwidth by restricting things to one CPU?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
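
[Editorial note] To make Nicholas's point about the ExpCmdSN/MaxCmdSN window
concrete, here is a minimal userspace sketch of session-wide CmdSN handling.
It is not the actual libiscsi code; the struct and function names are made up,
and a pthread mutex stands in for whatever session-wide serialization the
initiator uses. It only shows the shape of the protocol requirement: the window
is refreshed from every response PDU and checked, under one session-wide lock,
before a non-immediate command is given a sequence number.

	/*
	 * Hypothetical sketch of session-wide CmdSN window handling, loosely
	 * modeled on RFC 3720 semantics (not the actual libiscsi code).
	 * ExpCmdSN/MaxCmdSN arrive in every response PDU; a new non-immediate
	 * command may only be issued while CmdSN <= MaxCmdSN in serial number
	 * arithmetic, and every context sharing the session must serialize
	 * around this state.
	 */
	#include <pthread.h>
	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>

	struct iscsi_sess_sn {
		pthread_mutex_t	lock;		/* session-wide serialization */
		uint32_t	cmdsn;		/* next CmdSN to assign */
		uint32_t	exp_cmdsn;	/* ExpCmdSN from last response */
		uint32_t	max_cmdsn;	/* MaxCmdSN from last response */
	};

	/* Serial number arithmetic (RFC 1982): true if a <= b modulo 2^32. */
	static bool sna_lte(uint32_t a, uint32_t b)
	{
		return a == b || (int32_t)(a - b) < 0;
	}

	/* Receive path: update the window from a response PDU. */
	static void sess_update_window(struct iscsi_sess_sn *s,
				       uint32_t exp_cmdsn, uint32_t max_cmdsn)
	{
		pthread_mutex_lock(&s->lock);
		s->exp_cmdsn = exp_cmdsn;
		/* Only move MaxCmdSN forward. */
		if (sna_lte(s->max_cmdsn, max_cmdsn))
			s->max_cmdsn = max_cmdsn;
		pthread_mutex_unlock(&s->lock);
	}

	/*
	 * Issue path: hand out the next CmdSN if the window is open,
	 * otherwise return false so the caller can requeue/wait until
	 * a response reopens the window.
	 */
	static bool sess_get_cmdsn(struct iscsi_sess_sn *s, uint32_t *out_cmdsn)
	{
		bool ok = false;

		pthread_mutex_lock(&s->lock);
		if (sna_lte(s->cmdsn, s->max_cmdsn)) {
			*out_cmdsn = s->cmdsn++;
			ok = true;
		}
		pthread_mutex_unlock(&s->lock);
		return ok;
	}

	int main(void)
	{
		static struct iscsi_sess_sn sess = {
			.lock = PTHREAD_MUTEX_INITIALIZER,
			.cmdsn = 1, .exp_cmdsn = 1, .max_cmdsn = 8,
		};
		uint32_t sn;

		while (sess_get_cmdsn(&sess, &sn))
			printf("issued CmdSN %u\n", sn);

		/* Window now closed; a response PDU reopens it. */
		sess_update_window(&sess, 9, 16);
		if (sess_get_cmdsn(&sess, &sn))
			printf("window reopened, issued CmdSN %u\n", sn);
		return 0;
	}

The trade-off James describes is visible here: every issuing context contends
on the one lock, and whenever the window is closed the issue path stalls until
a response arrives, regardless of how many connections or queues sit below it.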
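[Editorial note] For contrast, a toy sketch of the "session per CPU" scheme Mike
describes, again with invented names and plain pthreads standing in for kernel
submission contexts: each context is pinned to one CPU and owns its session
outright, so the CmdSN can be a plain counter with no atomic_t or spinlock.

	/*
	 * Hypothetical sketch of the non-MCS "session per CPU" idea from the
	 * thread (not actual kernel code): each CPU/queue gets its own session,
	 * and because only the context bound to that CPU ever touches the
	 * session's CmdSN, a plain increment suffices.
	 */
	#define _GNU_SOURCE
	#include <pthread.h>
	#include <sched.h>
	#include <stdint.h>
	#include <stdio.h>

	#define NR_QUEUES 4

	struct percpu_session {
		uint32_t cmdsn;		/* only touched from its own CPU */
		unsigned int cpu;	/* CPU this session is bound to */
	};

	static struct percpu_session sessions[NR_QUEUES];

	static void *submit_thread(void *arg)
	{
		struct percpu_session *s = arg;
		cpu_set_t mask;

		/* Bind this submission context to the session's CPU
		 * (errors ignored for brevity). */
		CPU_ZERO(&mask);
		CPU_SET(s->cpu, &mask);
		pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);

		/* Plain, lock-free increments: no other CPU shares this state. */
		for (int i = 0; i < 1000; i++)
			s->cmdsn++;

		printf("cpu %u issued up to cmdsn %u\n", s->cpu, s->cmdsn);
		return NULL;
	}

	int main(void)
	{
		pthread_t tids[NR_QUEUES];

		for (unsigned int q = 0; q < NR_QUEUES; q++) {
			sessions[q].cpu = q;
			sessions[q].cmdsn = 1;
			pthread_create(&tids[q], NULL, submit_thread, &sessions[q]);
		}
		for (unsigned int q = 0; q < NR_QUEUES; q++)
			pthread_join(tids[q], NULL);
		return 0;
	}

Hannes's question is about the piece this sketch leaves out: the scheme only
pays off if the NIC's TX/RX queue steering delivers each connection's traffic
to the same CPU the session is bound to, otherwise the completion path crosses
CPUs anyway.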