On 6/15/24 03:10, Stephen Hemminger wrote: > [You don't often get email from stephen@xxxxxxxxxxxxxxxxxx. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > On Thu, 13 Jun 2024 11:22:02 +0300 > Omer Shpigelman <oshpigelman@xxxxxxxxx> wrote: > >> +static int hbl_en_ports_reopen(struct hbl_aux_dev *aux_dev) >> +{ >> + struct hbl_en_device *hdev = aux_dev->priv; >> + struct hbl_en_port *port; >> + int rc = 0, i; >> + >> + for (i = 0; i < hdev->max_num_of_ports; i++) { >> + if (!(hdev->ports_mask & BIT(i))) >> + continue; >> + >> + port = &hdev->ports[i]; >> + >> + /* It could be that the port was shutdown by 'ip link set down' and there is no need >> + * in reopening it. >> + * Since we mark the ports as in reset even if they are disabled, we clear the flag >> + * here anyway. >> + * See hbl_en_ports_stop_prepare() for more info. >> + */ >> + if (!netif_running(port->ndev)) { >> + atomic_set(&port->in_reset, 0); >> + continue; >> + } >> + > > Rather than duplicating network device state in your own flags, it would be better to use > existing infrastructure. Read Documentation/networking/operstates.rst > > Then you could also get rid of the kludge timer stuff in hbl_en_close(). > I think that additional explanation is needed here. In addition to netdev flows, we also support an internal reset flow (that's what the in_reset boolean indicates). Our NIC driver is an extension of the compute driver (they share the same HW) and a reset flow might be needed due to some compute operation which is entirely unrelated to the NIC driver. But we must not access the HW while this reset flow is executed. Note that this internal reset flow originates from the compute driver and hence we might have parallel netdev operations that should be blocked meanwhile. The internal reset flow has 2 phases - teardown and re-init. This reopen function is called during the re-init phase to restore the NIC ports, but only if they were actually opened prior to the reset flow. Regarding hbl_en_close() - during the port close we need to write to the HW so due to the explanation above, also there we should wait for an existing internal reset flow to finish first. Let me know if that's clear enough and addresses your concerns.