Hi John
As this is quite a political question I will answer it for you.
If the criticisms are narrowly interpreted as being about the front end we are running, then this information implies that they were likely based on an incorrect interpretation of the issues. If, however, the criticisms are more broadly interpreted as being about our whole approach of taking responsibility for the systems integration ourselves rather than outsourcing it to an integrated service provider, then this information does not change anything.
Jay
Sean,
I think this is obvious but, just to check my understanding: this implies that all of the attacks on Meetecho for unreliability of their software during those incidents were misguided or misdirected. Correct?
thanks,
john
--On Thursday, November 19, 2020 01:56 +0000 Sean Croghan <sean@xxxxxxxxxxxxxxxx> wrote:
As previously reported, we tracked down the cause of the interruption of the iabopen session to an issue with an unexpected Azure network interface removal event on network interfaces provisioned with SR-IOV. To prevent this happening again we intended to remove SR-IOV networking entirely. Unfortunately it now transpires that this change did not get applied to 2 of the 16 VMs including the application VM for the Plenary. So to add to the list of reasons to want 2020 to be over, towards the end of Plenary the same network interface removal event occurred and triggered an outage long enough to affect everyone.
I can confirm that the SR-IOV provisioning has now been removed from all VMs, which we believe eliminates the risk of the same thing happening again. We continue to work with Azure Direct Support to determine the underlying cause of the removal events.
Please let me know if you have any questions.
Sean
On Nov 17, 2020, at 4:56 PM, Sean Croghan wrote:
I have an update for those of you affected by the outage in yesterday's IABOPEN session. We have isolated this to an interruption of the virtual machine's network interface. We have engaged the hardware and network team at Azure to determine the cause of this event but do not have an explanation at this time.
I will provide an update when we have received more information.
For those interested in details:
At 07:56:36 UTC the network interface (eth0) went link down and the interface was removed from the VM
At 08:00:28 UTC a new interface was added to the VM
At 08:00:29 UTC the new interface (eth1) went link up
Yes, the VM added a new interface. The servers were provisioned with SR-IOV, and we suspect that a migration event moved the VM to different hardware, causing the NIC driver to be reloaded. We have found some evidence supporting our theory that a migration or unscheduled maintenance event occurred, and we are working to verify whether that happened during this event. We have removed SR-IOV from the network interfaces on all servers.
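For anyone who wants to put a number on the gap in the timeline above, here is a minimal sketch that computes the time between eth0 going down and eth1 coming up. The three timestamps are copied from this message; the function name and the event labels are illustrative only and are not taken from any NOC tooling.

```python
from datetime import datetime

# (UTC time, description) pairs copied from the timeline in this message.
# All three events occurred on the same day, so only the time of day is parsed.
events = [
    ("07:56:36", "eth0 link down, interface removed from the VM"),
    ("08:00:28", "new interface added to the VM"),
    ("08:00:29", "eth1 link up"),
]


def outage_seconds(timeline):
    """Return the seconds between the first and last event in the timeline."""
    first = datetime.strptime(timeline[0][0], "%H:%M:%S")
    last = datetime.strptime(timeline[-1][0], "%H:%M:%S")
    return int((last - first).total_seconds())


duration = outage_seconds(events)
print(f"Interface unavailable for {duration // 60} min {duration % 60} s")
```

By this calculation the VM had no working network interface for a little under four minutes.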
I hope you are having a good and productive week
-- The IETF NOC Team
--
109all mailing list
109all@xxxxxxxx
https://www.ietf.org/mailman/listinfo/109all