On 01/09/2015 07:00 PM, Michael Christie wrote:
>
> On Jan 8, 2015, at 11:03 PM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
>
>> On Thu, 2015-01-08 at 15:22 -0800, James Bottomley wrote:
>>> On Thu, 2015-01-08 at 14:57 -0800, Nicholas A. Bellinger wrote:
>>>> On Thu, 2015-01-08 at 14:29 -0800, James Bottomley wrote:
>>>>> On Thu, 2015-01-08 at 14:16 -0800, Nicholas A. Bellinger wrote:
>>
>> <SNIP>
>>
>>>> The point is that a simple session wide counter for command sequence
>>>> number assignment is significantly less overhead than all of the
>>>> overhead associated with running a full multipath stack atop multiple
>>>> sessions.
>>>
>>> I don't see how that's relevant to issue speed, which was the measure we
>>> were using: the layers above are just a hopper. As long as they're
>>> loaded, the MQ lower layer can issue at full speed. So as long as the
>>> multipath hopper is efficient enough to keep the queues loaded there's
>>> no speed degradation.
>>>
>>> The problem with a sequence point inside the MQ issue layer is that it
>>> can cause a stall that reduces the issue speed. So the counter sequence
>>> point causes a degraded issue speed over the multipath hopper approach
>>> above even if the multipath approach has a higher CPU overhead.
>>>
>>> Now, if the system is close to 100% CPU already, *then* the multipath
>>> overhead will try to take CPU power we don't have and cause a stall, but
>>> that's only in the flat-out CPU case.
>>>
>>>> Not to mention that our iSCSI/iSER initiator is already taking a session
>>>> wide lock when sending outgoing PDUs, so adding a session wide counter
>>>> isn't adding any additional synchronization overhead vs. what's already
>>>> in place.
>>>
>>> I'll leave it up to the iSER people to decide whether they're redoing
>>> this as part of the MQ work.
>>>
>>
>> Session wide command sequence number synchronization isn't something to
>> be removed as part of the MQ work. It's an iSCSI/iSER protocol
>> requirement.
>>
>> That is, the expected + maximum sequence numbers are returned as part of
>> every response PDU, which the initiator uses to determine when the
>> command sequence number window is open so new non-immediate commands may
>> be sent to the target.
>>
>> So, given some manner of session wide synchronization is required
>> between different contexts for the existing single connection case to
>> update the command sequence number and check when the window opens, it's
>> a fallacy to claim MC/S adds some type of new initiator specific
>> synchronization overhead vs. single connection code.
>
> I think you are assuming we are leaving the iscsi code as it is today.
>
> For the non-MCS mq session per CPU design, we would be allocating and
> binding the session and its resources to specific CPUs. They would only
> be accessed by the threads on that one CPU, so we get our
> serialization/synchronization from that. That is why we are saying we
> do not need something like atomic_t/spin_locks for the sequence number
> handling for this type of implementation.
>
Wouldn't that need to be coordinated with the networking layer?
Doesn't it do the same thing, matching TX/RX queues to CPUs?
If so, wouldn't we decrease bandwidth by restricting things to one CPU?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
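
[Editorial note] To make Nicholas's point about the ExpCmdSN/MaxCmdSN window
concrete, here is a minimal userspace sketch of session-wide CmdSN handling.
It is not the actual libiscsi code; the struct and function names are made up,
and a pthread mutex stands in for whatever session-wide serialization the
initiator uses. It only shows the shape of the protocol requirement: the window
is refreshed from every response PDU and checked, under one session-wide lock,
before a non-immediate command is given a sequence number.

	/*
	 * Hypothetical sketch of session-wide CmdSN window handling, loosely
	 * modeled on RFC 3720 semantics (not the actual libiscsi code).
	 * ExpCmdSN/MaxCmdSN arrive in every response PDU; a new non-immediate
	 * command may only be issued while CmdSN <= MaxCmdSN in serial number
	 * arithmetic, and every context sharing the session must serialize
	 * around this state.
	 */
	#include <pthread.h>
	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>

	struct iscsi_sess_sn {
		pthread_mutex_t	lock;		/* session-wide serialization */
		uint32_t	cmdsn;		/* next CmdSN to assign */
		uint32_t	exp_cmdsn;	/* ExpCmdSN from last response */
		uint32_t	max_cmdsn;	/* MaxCmdSN from last response */
	};

	/* Serial number arithmetic (RFC 1982): true if a <= b modulo 2^32. */
	static bool sna_lte(uint32_t a, uint32_t b)
	{
		return a == b || (int32_t)(a - b) < 0;
	}

	/* Receive path: update the window from a response PDU. */
	static void sess_update_window(struct iscsi_sess_sn *s,
				       uint32_t exp_cmdsn, uint32_t max_cmdsn)
	{
		pthread_mutex_lock(&s->lock);
		s->exp_cmdsn = exp_cmdsn;
		/* Only move MaxCmdSN forward. */
		if (sna_lte(s->max_cmdsn, max_cmdsn))
			s->max_cmdsn = max_cmdsn;
		pthread_mutex_unlock(&s->lock);
	}

	/*
	 * Issue path: hand out the next CmdSN if the window is open,
	 * otherwise return false so the caller can requeue/wait until
	 * a response reopens the window.
	 */
	static bool sess_get_cmdsn(struct iscsi_sess_sn *s, uint32_t *out_cmdsn)
	{
		bool ok = false;

		pthread_mutex_lock(&s->lock);
		if (sna_lte(s->cmdsn, s->max_cmdsn)) {
			*out_cmdsn = s->cmdsn++;
			ok = true;
		}
		pthread_mutex_unlock(&s->lock);
		return ok;
	}

	int main(void)
	{
		static struct iscsi_sess_sn sess = {
			.lock = PTHREAD_MUTEX_INITIALIZER,
			.cmdsn = 1, .exp_cmdsn = 1, .max_cmdsn = 8,
		};
		uint32_t sn;

		while (sess_get_cmdsn(&sess, &sn))
			printf("issued CmdSN %u\n", sn);

		/* Window now closed; a response PDU reopens it. */
		sess_update_window(&sess, 9, 16);
		if (sess_get_cmdsn(&sess, &sn))
			printf("window reopened, issued CmdSN %u\n", sn);
		return 0;
	}

The trade-off James describes is visible here: every issuing context contends
on the one lock, and whenever the window is closed the issue path stalls until
a response arrives, regardless of how many connections or queues sit below it.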
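[Editorial note] For contrast, a toy sketch of the "session per CPU" scheme Mike
describes, again with invented names and plain pthreads standing in for kernel
submission contexts: each context is pinned to one CPU and owns its session
outright, so the CmdSN can be a plain counter with no atomic_t or spinlock.

	/*
	 * Hypothetical sketch of the non-MCS "session per CPU" idea from the
	 * thread (not actual kernel code): each CPU/queue gets its own session,
	 * and because only the context bound to that CPU ever touches the
	 * session's CmdSN, a plain increment suffices.
	 */
	#define _GNU_SOURCE
	#include <pthread.h>
	#include <sched.h>
	#include <stdint.h>
	#include <stdio.h>

	#define NR_QUEUES 4

	struct percpu_session {
		uint32_t cmdsn;		/* only touched from its own CPU */
		unsigned int cpu;	/* CPU this session is bound to */
	};

	static struct percpu_session sessions[NR_QUEUES];

	static void *submit_thread(void *arg)
	{
		struct percpu_session *s = arg;
		cpu_set_t mask;

		/* Bind this submission context to the session's CPU
		 * (errors ignored for brevity). */
		CPU_ZERO(&mask);
		CPU_SET(s->cpu, &mask);
		pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);

		/* Plain, lock-free increments: no other CPU shares this state. */
		for (int i = 0; i < 1000; i++)
			s->cmdsn++;

		printf("cpu %u issued up to cmdsn %u\n", s->cpu, s->cmdsn);
		return NULL;
	}

	int main(void)
	{
		pthread_t tids[NR_QUEUES];

		for (unsigned int q = 0; q < NR_QUEUES; q++) {
			sessions[q].cpu = q;
			sessions[q].cmdsn = 1;
			pthread_create(&tids[q], NULL, submit_thread, &sessions[q]);
		}
		for (unsigned int q = 0; q < NR_QUEUES; q++)
			pthread_join(tids[q], NULL);
		return 0;
	}

Hannes's question is about the piece this sketch leaves out: the scheme only
pays off if the NIC's TX/RX queue steering delivers each connection's traffic
to the same CPU the session is bound to, otherwise the completion path crosses
CPUs anyway.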