On Mon, May 14, 2018 at 11:11:06AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@xxxxxxxxxxxx>
>
> From Parav:
>
> This series implements reference counts for the GID table entries.
>
> GID table entries for RoCE are based on IP addresses of netdevices and
> default GID entries are based on GUID.
>
> For the IB link layer they are based on the SM and provided by the HCA
> to the stack. These entries can get added/deleted/replaced at any time
> in a host. It is desirable to not delete a GID table entry while it is
> in use in various software flows such as CM message processing,
> resolving destination MAC and RDMACM message processing, to name a few.
>
> It is also desirable to not replace a GID table entry while the previous
> GID entry is still being used in a software flow. In a GID table,
> GID entries for RoCE can belong to multiple network namespaces.
> Without holding a reference to it, a software flow might process a
> CM message in two different network namespaces.
>
> This series is a prerequisite for enabling namespaces for RoCE. It ensures
> that while a GID entry is being used, it cannot get deleted. When the last
> user of the GID entry releases the reference, the GID entry is released in
> the software cache and in the HCA.
>
> GID entries are maintained by ib_core in a software cache for IB,
> RoCE and iWarp link layers. All GID search routines now refer to the GID
> software cache. This gives ULP modules and CM modules a consistent
> view of GID entries. Currently RoCE GID table entries are reference counted
> using kref. IB GID table entries are not yet reference counted; however,
> the software interface is common to both link layers. GID entry
> get/put/hold APIs for IB are no-ops.
>
> The series consists of patches for the following functionality:
> 1. Refactor code for cache handling of the RoCE GID entries
> 2. New APIs for GID get/put/hold operations
> 3. Changing existing APIs and their users to follow the new API signature
>
> The series also removes the dependency on net+ifindex in the path record
> entry for RoCE. Both of those fields of the netdev can change, and storing
> the net namespace without holding a reference will lead to a use-after-free
> crash. Therefore it is removed. Netdevice information for RoCE will be
> provided via a referenced GID attribute in the ib_cm_req entry in future.
>
> Summary of APIs:
> 1. A caller who just wishes to query the GID based on device, port and
> index should call rdma_query_gid().
>
> 2. A caller who wishes to search for and use the GID and/or GID attributes
> for a longer period of time in a software flow should use the
> rdma_find_* APIs.
>
> 3. A caller who knows the GID index and wants to refer to the GID and/or
> GID attributes should call rdma_get_gid_attr() and must call
> rdma_put_gid_attr().
>
> 4. A caller with an existing GID reference pointer should call
> rdma_hold_gid_attr() if it wishes to increment the reference to the
> GID attribute.
>
> 5. The cached prefix is dropped from most GID-related APIs, as all the APIs
> provide information from the cache. The IB prefix is replaced with rdma to
> match similar new and existing APIs.
>
> Thanks
>
> Parav Pandit (12):
>   IB/cm: Avoid AV ah_attr overwriting during LAP message handling
>   IB/cm: Store and restore ah_attr during LAP msg processing
>   IB/cm: Store and restore ah_attr during CM message processing

I applied the first three to for-next; the rest will need a v2.

Thanks,
Jason
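
For readers following the get/put/hold rules in the API summary, the lifetime semantics can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the ib_core code: every toy_* name and the struct layout are hypothetical, a plain int counter stands in for the kref the cover letter mentions, and there is no locking. It shows a deleted entry staying alive until its last user drops the reference, and the slot not being replaceable until then.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_TABLE_SIZE 4

/* Hypothetical stand-in for a GID table entry; not the kernel's struct. */
struct toy_gid_attr {
    unsigned char gid[16];
    int index;
    int refcount;   /* 0 means the slot is free */
    bool valid;     /* false once deletion has been requested */
};

static struct toy_gid_attr toy_table[TOY_TABLE_SIZE];

/* Add an entry; the table itself holds the initial reference. */
static int toy_add_gid(int index, const unsigned char gid[16])
{
    if (index < 0 || index >= TOY_TABLE_SIZE || toy_table[index].refcount != 0)
        return -1;  /* out of range, or slot still referenced by a user */
    memcpy(toy_table[index].gid, gid, 16);
    toy_table[index].index = index;
    toy_table[index].refcount = 1;
    toy_table[index].valid = true;
    return 0;
}

/* Take an extra reference on an entry the caller already holds. */
static void toy_hold_gid_attr(struct toy_gid_attr *attr)
{
    attr->refcount++;
}

/* Drop a reference; the slot is released when the last one goes away. */
static void toy_put_gid_attr(struct toy_gid_attr *attr)
{
    assert(attr->refcount > 0);
    if (--attr->refcount == 0)
        memset(attr, 0, sizeof(*attr));
}

/* Look up by index and return a referenced entry, or NULL. */
static struct toy_gid_attr *toy_get_gid_attr(int index)
{
    if (index < 0 || index >= TOY_TABLE_SIZE || !toy_table[index].valid)
        return NULL;
    toy_hold_gid_attr(&toy_table[index]);
    return &toy_table[index];
}

/* Request deletion: hide the entry from new lookups, drop the table's ref. */
static int toy_del_gid(int index)
{
    if (index < 0 || index >= TOY_TABLE_SIZE || !toy_table[index].valid)
        return -1;
    toy_table[index].valid = false;
    toy_put_gid_attr(&toy_table[index]);
    return 0;
}

/* Walk through the lifetime rules; returns 0 if every step behaves as described. */
static int toy_demo(void)
{
    const unsigned char gid_a[16] = { 0xfe, 0x80 };
    const unsigned char gid_b[16] = { 0xfe, 0x80, [15] = 1 };
    struct toy_gid_attr *attr;

    if (toy_add_gid(0, gid_a)) return 1;
    attr = toy_get_gid_attr(0);               /* a flow starts using the entry */
    if (!attr) return 2;
    if (toy_del_gid(0)) return 3;             /* deletion requested mid-flow */
    if (attr->refcount != 1) return 4;        /* entry is still alive for the flow */
    if (toy_get_gid_attr(0)) return 5;        /* but invisible to new lookups */
    if (toy_add_gid(0, gid_b) == 0) return 6; /* and not replaceable yet */
    toy_put_gid_attr(attr);                   /* last user releases */
    if (toy_add_gid(0, gid_b)) return 7;      /* slot can now be reused */
    return 0;
}
```

Calling toy_demo() from a main() exercises the whole sequence once. In this model the table's own reference is what toy_del_gid() drops, so the actual release is deferred to whichever user calls toy_put_gid_attr() last.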