On 26-Nov-18 09:48, Leon Romanovsky wrote:
> From: Parav Pandit <parav@xxxxxxxxxxxx>
>
> Describe core documentation for ib_device_mutex, ib_list_rwsem,
> net_rwsem and compat_dev_sem.
>
> Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
> Documentation/infiniband/core_devices.txt | 73 ++++++++++++++++++++++-
> 1 file changed, 70 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/infiniband/core_devices.txt b/Documentation/infiniband/core_devices.txt
> index ff28def49ff4..fd98b27dd3d9 100644
> --- a/Documentation/infiniband/core_devices.txt
> +++ b/Documentation/infiniband/core_devices.txt
> @@ -51,7 +51,7 @@ All ib_core_device(s) points to one owner ib_device using driver_data.
> | | | |
> | | | |
> | +----------+ | (init_net)
> - | | ib_device
> + | *net | ib_device
> | *owner-------------------------+------>+--------------------+<--+
> +--------------+ | | | |
> | | ib_core_device | |
> @@ -63,12 +63,12 @@ All ib_core_device(s) points to one owner ib_device using driver_data.
> +--------------+ | | | | | | | |
> | | | | | | | | | |
> | device | | | | +----------+ | | |
> - | +----------+ | | | | | | |
> + | +----------+ | | | | *net | | |
> | | | | | | | *owner--------------+
> | | | | | | +--------------+ |
> | | | | | +--------------------+
> | +----------+ | |
> - | | |
> + | *net | |
> | *owner------------------------+
> +--------------+
>
> @@ -96,6 +96,73 @@ ib_device
> | | | | | |
> | | +----------+ | |
> | | | |
> +| | *net | |
> | | *owner | |
> | +--------------+ |
> +--------------------+
> +
> +2.3 locking scheme
> +--------------------------------------------------------
> +There are three locks involved to provide synchronization between four
> +operations.
> +These four operations are
> +(a) device addition using ib_register_device()
> +(b) device removal using ib_unregister_device()
> +(c) net namespace addition using _init_net() notifier
> +(d) net namespace removal using _exit_net() notifier
> +
> +ib_register_device() and ib_unregister_device() work on all net namespaces
> +to add/remove compat devices. Therefore, they need to hold net_rwsem read
> +lock so that net namespace doesn't disappear while this compat devices
> +add/remove occurs.
> +
> +Multiple rdma devices of same or different vendors can be enumerated in
> +parallel trying to add/remove compat devices for a net namespace.
> +Therefore, protect compat device list operations using compat_rwsem read/write
> +semaphore.
> +Even though multiple device enumration is currently guarded using

Hi Leon,

"enumration" -> "enumeration"

> +ib_device_mutex, it is right to not depend on this lock for synchronization for
> +adding/removing compat devices to a net namespace list.
> +This will allow ib_device_mutex related changes without changing compat device
> +locking scheme.
> +
> +_init_net() and _exit_net() callbacks work on individual net namespace.
> +Therefore, they only need to work on net namespace specific dev_list
> +without net_rwsem lock.
> +
> +Once an rdma device is registered and available in device list, net namespace
> +enumeration routines init_net() and exit_net() can enumerate the rdma devices
> +to add/remove their compat devices. net_rwsem is not locked while net core
> +invokes these notifiers, therefore rely on ib_device_mutex and list_rwsem, so
> +that only only one flow can add/remove compat devices.

"only only" -> "only".

> +ib_device_mutex is chosen to synchronize between device enumeration routines
> +and net enumeration compare to net_rwsem because net_rwsem is maintly for

"maintly" -> "mainly".

> +protecting net addition to the global list.
> +
> +With above scheme, the generic lock hirerchy among above 4 code flow is:

"hirerchy" -> "hierarchy".

> +
> +level-1: ib_device_mutex
> +level-2: ib_list_rwsem (optional, only read locked)
> +level-3: net_rwsem (optional, only read locked)
> +level-4: compat_rwsem
> +
> +2.3.1 locks in device enumeration path
> +--------------------------------------
> +level-1: ib_device_mutex
> +level-2: ib_list_rwsem (read locked)
> +level-3: net_rwsem (read locked)
> +level-4: compat_rwsem
> +
> +2.3.2 locks in net enumeration path
> +-----------------------------------
> +
> +level-1: ib_device_mutex
> +level-2: ib_list_rwsem (read locked)
> +level-3: compat_rwsem
> +
> +2.3.3 locks in device rename path
> +-----------------------------------
> +
> +level-1: rdma_nl_mutex
> +level-2: ib_device_mutex
> +level-3: compat_rwsem
>
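
As an aside, the ordering rule in 2.3, 2.3.1 and 2.3.2 can be written down as a small check. This is only a userspace sketch and not kernel code: the enum values just number the levels quoted above, and lock_order_is_valid(), dev_enum_path and net_enum_path are hypothetical names made up for illustration. It assumes that "optional" levels may be skipped but never taken out of order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Levels numbered as in the generic hierarchy of section 2.3. */
enum lock_level {
	IB_DEVICE_MUTEX = 1,	/* level-1 */
	IB_LIST_RWSEM,		/* level-2, read locked */
	NET_RWSEM,		/* level-3, read locked */
	COMPAT_RWSEM,		/* level-4 */
};

/* 2.3.1: the device enumeration path takes all four levels. */
static const enum lock_level dev_enum_path[] = {
	IB_DEVICE_MUTEX, IB_LIST_RWSEM, NET_RWSEM, COMPAT_RWSEM,
};

/* 2.3.2: the net enumeration path skips net_rwsem. */
static const enum lock_level net_enum_path[] = {
	IB_DEVICE_MUTEX, IB_LIST_RWSEM, COMPAT_RWSEM,
};

/*
 * Hypothetical helper: returns true when every lock in seq sits at a
 * strictly higher level than the lock acquired just before it.
 */
static bool lock_order_is_valid(const enum lock_level *seq, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (seq[i] <= seq[i - 1])
			return false;
	}
	return true;
}
```

Both documented paths pass this check, while a sequence that takes compat_rwsem before ib_device_mutex is rejected. The rename path in 2.3.3 is left out on purpose, since rdma_nl_mutex is not part of the generic four-level list.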