On 9/28/2020 1:07 PM, Shyam Sundar wrote:
I am open to removing the accounting against the "detecting" port for now, given that currently, there are no known implementations where the N_Port initiates the FPIN ELS. Let me know what you think.
Ok - let's not change counters on the "detecting" port.
I guess this is ok - but it makes it hard for administrators. I believe this is the list of the other N_Ports (aka NPIV) on the "attached port" that is generating the error. In that respect, it is correct to increment their counters - but I hope that an administrator knows that they may resolve to a single physical port with only 1/N of the error count. From our use case in Linux, as an initiator, to match an rport it must be a target port using NPIV, and from our point of view we don't know that they are all sharing the same physical port.
Shyam: I agree. But with the information in hand, I am not sure how we could do this better at this point.
Agree - we'll leave it as is.
Question: I know we've been asked to log the FPINs to the kernel log. Holding on to the counts and so on is good, but it still loses some of the relationship of the detected port (which port detected which attached port). What's your thinking on it? Should it be something in these common routines, enabled/disabled by a sysfs toggle?
Shyam: So far, I have been looking at it from the point of view of gathering and maintaining the error stats closest to the source of their origin. So irrespective of whether an error was "detected" by the Nx_Port itself or by the F_Port attached to it, we are pointing the administrator towards the Nx_Port (by accounting for the error and tying it to that port). Having said that, I do not think I completely grasp the essence of your question here, or your proposal of turning it on/off. Could you please elaborate?
I'm saying that we have no idea who the "detecting" port was in all of the statistics. At least, by not counting the detecting port, we know that anything that has counters incrementing was generating the issue. I don't know how important it is to know the detecting port - if switch/fabric, it probably doesn't matter. If an NxPort, it may be interesting to know. We also have no idea if all the counter updates occurred in 1 fpin, or in N fpins. What I was suggesting was to log something like "FPIN <type> <detecting> <attached>", with one per descriptor type in the FPIN. We could default this logging off, and change a tunable to turn it on.
However, I feel like I'm trying too hard for this - so let's just ignore it. We can always add it in the future.
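For reference, the logging idea described above could look roughly like the following. This is only a userspace sketch under stated assumptions, not code from the patchset: the `fpin_log_enabled` tunable, the `fpin_desc_info` structure and the `fpin_format_log()` helper are all hypothetical names, and a real implementation would sit in the common transport routines and use the kernel's own logging and tunable mechanisms.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical tunable; defaults to off, as suggested in the thread. */
static bool fpin_log_enabled = false;

/* Hypothetical, simplified view of one descriptor carried in an FPIN. */
struct fpin_desc_info {
	const char *type;        /* descriptor type name, e.g. "link-integrity" */
	uint64_t detecting_wwpn; /* port that detected the event */
	uint64_t attached_wwpn;  /* port the event is attributed to */
};

/*
 * Format one "FPIN <type> <detecting> <attached>" line into buf.
 * Returns the snprintf() result, or 0 (with an empty buf) when the
 * tunable is off.
 */
static int fpin_format_log(char *buf, size_t len,
			   const struct fpin_desc_info *d)
{
	assert(buf != NULL && len > 0 && d != NULL);

	if (!fpin_log_enabled) {
		buf[0] = '\0';
		return 0;
	}

	return snprintf(buf, len, "FPIN %s 0x%016" PRIx64 " 0x%016" PRIx64,
			d->type, d->detecting_wwpn, d->attached_wwpn);
}
```

Calling the helper once per descriptor would also show whether a set of counter updates arrived in one FPIN or in several, since each FPIN would produce its own group of lines.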
All the other comments make sense to me. I'll roll them in and send out another patchset shortly. Regards Shyam
Sounds good. Thanks -- james