Hello Hannes,
Thanks for responding.
On Wed, 2021-03-31 at 09:25 +0200, Hannes Reinecke wrote:
> Hi Erwin,
>
> On 3/31/21 2:22 AM, Erwin van Londen wrote:
> > Hello Muneendra, Benjamin,
> >
> > The FPINs that have been developed come with a whole plethora of
> > options and do not merely signal that a path is in a marginal
> > state. The mpio layer could utilise the various triggers, like
> > congestion and latency, and not just use a marginal state as a
> > decisive point. If a path is somewhat congested, the number of IOs
> > dispersed over it could simply be reduced by a flexible margin,
> > depending on how often, and which, FPINs are actually received.
> > If, for instance, an FPIN is received saying that an upstream port
> > is throwing physical errors, you may exclude it entirely from
> > having IOs queued to it. If it is a latency-related problem where
> > credit shortages come into play, you may just need to queue very
> > small IOs to it; the SCSI CDB will tell the size of the IO.
> > Congestion notifications may just be used to add an artificial
> > delay, to reduce the workload on these paths and schedule IOs on
> > others.
>
> As correctly noted, FPINs come with a variety of options. And I'm
> not certain we can handle everything correctly; a degraded path is
> simple, but for congestion there is only _so_ much we can do.
> The typical cause of congestion is, say, a 32G host port talking to
> a 16G (or even 8G) target port _and_ a 32G target port.
Congestion can also be caused by a change in workload characteristics where, for example, read and write workloads start interfering. The funnel principle would not apply in that case.
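To make the dispersion idea from my earlier mail concrete, here is a rough userspace model, not dm-multipath code; all names (struct path_state, pick_path, the 8k cut-off, the weight of 64) are made up. It scales each path's share of IOs by how often congestion FPINs arrived, excludes paths with link-integrity errors, and steers large IOs (size taken from the CDB) away from credit-starved paths:

/*
 * Illustrative sketch only; every name and threshold here is
 * hypothetical, chosen just to show the shape of the logic.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct path_state {
    const char *name;
    unsigned congestion_fpins;   /* FPINs seen in the current window */
    bool credit_starved;         /* latency/credit FPIN outstanding  */
    bool physical_errors;        /* link-integrity FPIN outstanding  */
    unsigned inflight;           /* IOs currently queued to the path */
};

/* Effective weight: a clean path counts fully; each congestion FPIN
 * in the window halves its share. */
static unsigned path_weight(const struct path_state *p)
{
    unsigned w = 64;
    unsigned n = p->congestion_fpins;

    if (p->physical_errors)
        return 0;               /* physical errors: exclude entirely */
    while (n-- && w > 1)
        w /= 2;                 /* halve the share per congestion FPIN */
    return w;
}

/* Pick the least loaded path relative to its weight; credit-starved
 * paths stay eligible for small IOs only. */
static struct path_state *pick_path(struct path_state *paths, size_t n,
                                    size_t io_bytes)
{
    struct path_state *best = NULL;

    for (size_t i = 0; i < n; i++) {
        unsigned w = path_weight(&paths[i]);

        if (w == 0)
            continue;
        if (paths[i].credit_starved && io_bytes > 8192)
            continue;           /* large IO: skip credit-starved path */
        if (!best || paths[i].inflight * 64 / w <
                     best->inflight * 64 / path_weight(best))
            best = &paths[i];
    }
    return best;
}

int main(void)
{
    struct path_state paths[] = {
        { "sdc", 0, false, false, 4 },
        { "sdd", 3, false, false, 1 },  /* congested: reduced share   */
        { "sde", 0, true,  false, 0 },  /* credit-starved: small only */
    };
    struct path_state *p = pick_path(paths, 3, 65536);

    printf("64k IO goes to %s\n", p ? p->name : "no path");
    return 0;
}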
> So the host cannot 'tune down' its link to 8G; doing so would impact
> performance on the 32G target port. (And we would suffer reverse
> congestion whenever that target port sends frames.)
> And throttling things at the SCSI layer only helps _so_ much, as the
> real congestion is due to the speed with which the frames are
> sequenced onto the wire. Which is not something we in the OS can
> control.
If you can interleave IOs with an artificial delay, depending on the type and frequency with which these FPINs arrive, you would be able to prevent latency build-up in the SAN. Something along the lines of the sketch below.
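A minimal sketch of that interleaving, assuming a per-path FPIN counter and a delay that decays while the path is quiet. The penalty values and all names are hypothetical; the real thing would live in the dispatch path, not in userspace:

#include <stdio.h>

/* Per-FPIN-type penalty in microseconds added between IOs; the
 * numbers are placeholders, not tuned values. */
enum fpin_kind { FPIN_CONGESTION, FPIN_PEER_CONGESTION, FPIN_LINK_INTEGRITY };

static const unsigned penalty_us[] = {
    [FPIN_CONGESTION]      = 50,  /* we are flooding the fabric       */
    [FPIN_PEER_CONGESTION] = 20,  /* remote port is the bottleneck    */
    [FPIN_LINK_INTEGRITY]  = 0,   /* handled via path state, not delay */
};

/* Delay grows with how often FPINs arrive and halves for every quiet
 * second, so old congestion is forgotten quickly. */
static unsigned interleave_delay_us(unsigned fpins_last_sec,
                                    enum fpin_kind kind,
                                    unsigned quiet_secs)
{
    unsigned d = fpins_last_sec * penalty_us[kind];

    while (quiet_secs--)
        d /= 2;
    return d;
}

int main(void)
{
    /* 10 congestion FPINs in the last second: 500us between IOs. */
    printf("%u us\n", interleave_delay_us(10, FPIN_CONGESTION, 0));
    /* Same burst, but 3 quiet seconds since: back down to 62us. */
    printf("%u us\n", interleave_delay_us(10, FPIN_CONGESTION, 3));
    return 0;
}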
> From another POV this is arguably a fabric mis-design; so it _could_
> be alleviated by separating out the ports with lower speeds into
> their own zone (or even onto a separate SAN); that would trivially
> make the congestion go away.
The entire FPIN concept was designed to provide clients with the option to respond and react to changing behaviour in SANs. A mis-design is often not really the case; ongoing changes and continuous provisioning are the main contributors.
> But for that the admin first should be _alerted_, and this really is
> my primary goal: having FPINs showing up in the message log, to
> alert the admin that his fabric is not performing well.
I think the FC drivers already have facilities to do that, or they will have them shortly. dm-multipath is not really required to handle the notifications, but it would be useful if actions were taken based on FPINs.
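For reference, classifying what arrived before logging it could look roughly like this. The descriptor tags are the FPIN descriptor values from FC-LS-5 (they also appear in the kernel's include/uapi/scsi/fc/fc_els.h, if I read it correctly); how the buffer reaches the consumer is left out, and the function name is made up:

#include <stdint.h>
#include <stdio.h>
#include <arpa/inet.h>          /* ntohl: FC frames are big-endian */

#define ELS_DTAG_LNK_INTEGRITY  0x00020001
#define ELS_DTAG_DELIVERY       0x00020002
#define ELS_DTAG_PEER_CONGEST   0x00020003
#define ELS_DTAG_CONGESTION     0x00020004

/* Each FPIN descriptor starts with a 4-byte tag; classify and log. */
static void log_fpin_descriptor(const uint8_t *desc)
{
    uint32_t tag = ntohl(*(const uint32_t *)desc);

    switch (tag) {
    case ELS_DTAG_LNK_INTEGRITY:
        printf("FPIN: link integrity event (marginal path candidate)\n");
        break;
    case ELS_DTAG_DELIVERY:
        printf("FPIN: delivery notification (frames discarded)\n");
        break;
    case ELS_DTAG_PEER_CONGEST:
        printf("FPIN: peer congestion (remote port oversubscribed)\n");
        break;
    case ELS_DTAG_CONGESTION:
        printf("FPIN: congestion on the attached link\n");
        break;
    default:
        printf("FPIN: unknown descriptor tag 0x%08x\n", tag);
    }
}

int main(void)
{
    uint32_t tag = htonl(ELS_DTAG_PEER_CONGEST);

    log_fpin_descriptor((const uint8_t *)&tag);
    return 0;
}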
> A second step will be to massage FPINs into DM multipath, and have
> them influence the path priority or path status. But it is currently
> under discussion how this could best be integrated.
OK
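For what it's worth, a toy model of how accumulated FPIN state could feed a path priority, so that healthy paths sort into the active path group and marginal ones become a last resort. Nothing here is multipathd code; the function, the counters, and the priority values are all invented for illustration:

#include <stdio.h>

struct fpin_stats {
    unsigned link_integrity;    /* physical-error FPINs seen */
    unsigned congestion;        /* congestion FPINs seen     */
};

/* Higher is better; dm-multipath groups paths by priority, so the
 * three bands below would translate into three path groups. */
static int fpin_prio(const struct fpin_stats *s)
{
    if (s->link_integrity)
        return 1;               /* marginal: keep as last resort    */
    if (s->congestion > 5)
        return 10;              /* usable, but prefer cleaner paths */
    return 50;                  /* healthy path                     */
}

int main(void)
{
    struct fpin_stats clean = { 0, 0 }, noisy = { 0, 8 }, bad = { 2, 0 };

    printf("clean=%d noisy=%d bad=%d\n",
           fpin_prio(&clean), fpin_prio(&noisy), fpin_prio(&bad));
    return 0;
}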
> > Not really sure what the possibilities are from a DM-Multipath
> > viewpoint, but I feel that if the OS options are not properly
> > aligned with what the FC protocol and HBA drivers are able to
> > provide, we may miss a good opportunity to optimise the dispersion
> > of IOs and improve overall performance.
>
> Looking at the size of the commands is one possibility, but at this
> time this presumes too much about how we _think_ FPINs will be
> generated. I'd rather do some more tests to figure out under which
> circumstances we can expect which type of FPINs, and then start
> looking for ways to integrate them.
The FC protocol only describes the framework and not the values that need to be adhered to. That depends on the end devices and their capabilities.
> Cheers,
>
> Hannes