Hi James, On Wed, Apr 22, 2020 at 06:35:24PM -0700, James Smart wrote: > On 4/15/2020 8:51 AM, Hannes Reinecke wrote: > > Relating to my previous comments: Can you elaborate about the lifetime > > rules to the node? It looks as if any access to the node is being > > guarded by the main efc->lock. > > Which, taken to extremes, would meant that one has to hold that lock for > > _any_ access to the node; that would include I/O processing etc, too. > > I can't really see how that would scale to the IOPS you are pursuing, so > > clearly that will not happen. > > So how _do_ you ensure the validity of a node if the lock is not held? > > > > Cheers, > > > > Hannes > > This lock is defined for single EQ/RQ. For multi EQ/RQ which we will add in > future, will have more granular locks and performance improvements. > All the State Machine Events are triggered from the EQ processing thread > except vport creation and deletion. > > Locking: > Global lock per efc port "efc->lock" > > Domain: > efc_domain_cb --> (All the events are executed under efc lock) > > Async link up notification(EFC_HW_DOMAIN_FOUND) > Alloc Domain, Start discovery > Alloc Sport > Async link down notification (EFC_HW_DOMAIN_LOST) > Hold frames. (New frames will be moved to pending) > Post Sport Shutdown. > Post All Nodes Shutdown. > > Sport: > efc_lport_cb --> (All the events are executed under efc lock) > > HW sport registration (Mbox command responses) to complete sport > allocation. > > Vport : > efc_sport_vport_new() --> New Vport > > efc_vport_sport_alloc() (Protected by efc lock) > Alloc Sport, and start Sport SM > > efc_sport_vport_del() --> Delete Vport > > Post shutdown event to the Sport. (Protected by efc lock) > > Node: > efc_remote_node_cb() --> (All the events are executed under efc lock) > > HW node registration (Mbox command responses) to complete node > allocation. > > Node lifeCycle: > efc_domain_dispatch_frame:- > > EFC lock held > efc_node_find() IF not found, Create one. > when PLOGI is received, Register with hardware > Upon PRLI completion, setup a new session with LIO. > > IO path: > > Find sport and node under EFC lock. > If node can process commands, start IO processing. > Each IO is added to the node->active_ios. > > Node deletion: (RSCN notification or LOGO received ..) > > EFC lock held > Disable receiving fcp commands. (node->fcp_enabled) > Disable IO allocation for this node. > Remove LIO session. > Deregister node with HW > Waits for all active_ios to be completed/freed. > Purge pending IOs. > Free the Node. Thanks a lot for this. IMO, this should be added as documentation. > In IO path, EFC lock is acquired to find the sport and node, release the EFC > lock and continue with IO allocation and processing. Note: There is still an > unsafe area where we check for 'node->hold_frames" without the lock. Is this is the fast path? Would RCU help to avoid taking the lock at all? The usage pattern sounds like it would be a candidate for RCU. > Node is > assumed to be kept alive until all the IOs under the node are freed. Adding > the refcounting will remove this assumption. Thanks, Daniel