Re: [PATCH v9 07/13] lpfc: vmid: Implements ELS commands for appid patch

Benjamin Block <bblock@xxxxxxxxxxxxx> · Thu, 22 Apr 2021 11:28:51 +0200

On Wed, Apr 21, 2021 at 03:55:15PM -0700, James Smart wrote:
> On 4/20/2021 5:38 AM, Benjamin Block wrote:
> ...
> > > +	len = *((u32 *)(pcmd + 4));
> > > +	len = be32_to_cpu(len);
> > > +	memcpy(vport->qfpa_res, pcmd, len + 8);
> > > +	len = len / LPFC_PRIORITY_RANGE_DESC_SIZE;
> > > +
> > > +	desc = (struct priority_range_desc *)(pcmd + 8);
> > > +	vmid_range = vport->vmid_priority.vmid_range;
> > > +	if (!vmid_range) {
> > > +		vmid_range = kcalloc(MAX_PRIORITY_DESC, sizeof(*vmid_range),
> > > +				     GFP_KERNEL);
> > > +		if (!vmid_range) {
> > > +			kfree(vport->qfpa_res);
> > > +			goto out;
> > > +		}
> > > +		vport->vmid_priority.vmid_range = vmid_range;
> > > +	}
> > > +	vport->vmid_priority.num_descriptors = len;
> > > +
> > > +	for (i = 0; i < len; i++, vmid_range++, desc++) {
> > > +		lpfc_printf_vlog(vport, KERN_DEBUG, LOG_ELS,
> > > +				 "6539 vmid values low=%d, high=%d, qos=%d, "
> > > +				 "local ve id=%d\n", desc->lo_range,
> > > +				 desc->hi_range, desc->qos_priority,
> > > +				 desc->local_ve_id);
> > > +
> > > +		vmid_range->low = desc->lo_range << 1;
> > > +		if (desc->local_ve_id == QFPA_ODD_ONLY)
> > > +			vmid_range->low++;
> > > +		if (desc->qos_priority)
> > > +			vport->vmid_flag |= LPFC_VMID_QOS_ENABLED;
> > > +		vmid_range->qos = desc->qos_priority;
> > 
> > I'm curious, if the FC-switch signals it supports QoS for a range here, how
> > exactly does this interact with the VM IDs that you seem to allocate
> > dynamically during runtime for cgroups that request specific App IDs?
> > You don't seem to use `LPFC_VMID_QOS_ENABLED` anywhere else in the
> > series.
> >
> > Would different cgroups get different QoS classes/guarantees depending
> > on the selected VM ID (higher VM ID gets better QoS class, or something
> > like that?)? Would the tagged traffic be handled differently than the
> > ordinary traffic in the fabric?
> 
> The simple answer is there is no interaction w/ the cgroup on priority.
> And no, we don't really look at or use it.  The ranges don't really have hard
> priority values. The way it works is that all values within a range are
> equal; a value in the first range is "higher priority" than a value in the
> second range; and a value in the second range is higher than those in the
> third range, and so on. 

Ah. That's interesting. I thought it was like that, but wasn't sure from
the spec. Thanks for clarifying.
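
To restate my understanding as code: a rough sketch only, with made-up names
(`struct prio_range`, `range_index()`, `prio_cmp()`) that are not from the
patch, and assuming the ranges are kept in the order the switch reported them:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified range; not the driver's structure. */
struct prio_range {
	unsigned int low;
	unsigned int high;
};

/*
 * Return the index of the range containing id, or -1 if none does.
 * A smaller index means "higher priority"; all IDs within one range
 * are treated as equal.
 */
static int range_index(const struct prio_range *r, size_t n, unsigned int id)
{
	size_t i;

	assert(r != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (id >= r[i].low && id <= r[i].high)
			return (int)i;
	return -1;
}

/* Return <0 if a outranks b, 0 if they are equal, >0 if b outranks a. */
static int prio_cmp(const struct prio_range *r, size_t n,
		    unsigned int a, unsigned int b)
{
	return range_index(r, n, a) - range_index(r, n, b);
}
```

So only the position of the range matters, not any weight attached to it.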

> Doesn't really matter whether the range was marked
> Best Effort or H/M/L. There's no real "weight".
> 
> What you see is the driver simply recording the different ranges so that it
> knows what to allocate from later on. The driver creates a flat bitmap of
> all possible values (max of 255) from all ranges - then will allocate values
> on a first bit set basis.  I know at one point we were going to only
> auto-assign if there was 1 range, and if there were multiple ranges, defer
> to a mgmt authority to tell us which range, but this obviously doesn't do
> that.

I was worrying a bit whether this would create some hard to debug
problems in the wild, when QoS essentially depends on the order in which
Applications/Containers are started and get IDs assigned accordingly -
assuming there are multiple priority ranges.
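
To illustrate what I mean, here is a minimal sketch of a flat bitmap with
lowest-bit-first allocation. The names (`struct id_pool`, `pool_add_range()`,
`pool_alloc()`) and the "bit set = free" convention are my assumptions for the
example, not what the patch does internally:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_IDS 256	/* assumed: IDs fit into one byte */

/* Hypothetical pool; one bit per possible ID, set = available. */
struct id_pool {
	uint8_t free[MAX_IDS / 8];
};

static void pool_init(struct id_pool *p)
{
	memset(p, 0, sizeof(*p));
}

/* Mark every ID in [low, high] as available. */
static void pool_add_range(struct id_pool *p, unsigned int low,
			   unsigned int high)
{
	unsigned int id;

	assert(low <= high && high < MAX_IDS);
	for (id = low; id <= high; id++)
		p->free[id / 8] |= (uint8_t)(1u << (id % 8));
}

/* Hand out the lowest available ID, or -1 if the pool is exhausted. */
static int pool_alloc(struct id_pool *p)
{
	unsigned int id;

	for (id = 0; id < MAX_IDS; id++) {
		if (p->free[id / 8] & (1u << (id % 8))) {
			p->free[id / 8] &= (uint8_t)~(1u << (id % 8));
			return (int)id;
		}
	}
	return -1;
}
```

With two ranges of two IDs each, say 1-2 and 10-11, the first two requesters
get IDs from the first range and the third one lands in the second range,
purely because of start order.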

> Also... although this is coded to support the full breadth of what the
> standard allows, it may well be the switch only implements 1 range in
> practice.
> 

-- 
Best Regards, Benjamin Block  / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH    /    https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen         /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294