On Fri, Jul 17, 2015 at 10:02 PM, Jason Gunthorpe <jgunthorpe@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Jun 24, 2015 at 03:59:21PM +0300, Matan Barak wrote:
>> +
>> +	/* in rdma_cap_roce_gid_table, this function should be protected by a
>> +	 * sleep-able lock.
>> +	 */
>> +	write_lock(&table->data_vec[ix].lock);
>
> I'm having a hard time understanding this comment
>

The same function is used for both RoCE and IB. For RoCE, write_gid
unlocks the rwlock, calls the vendor's callback, and then locks it
again. If two write_gid calls execute simultaneously, they need to be
protected from writing to the same entry, and since the vendor's
callback might sleep, a sleepable lock is required.

>> +int ib_cache_gid_add(struct ib_device *ib_dev, u8 port,
>> +		     union ib_gid *gid, struct ib_gid_attr *attr)
>> +{
>> +	struct ib_gid_table **ports_table =
>> +		READ_ONCE(ib_dev->cache.gid_cache);
>> +	/* all table reads depend on ports_table, no need for smp_rmb() */
>> +	if (!ports_table)
>> +		return -EOPNOTSUPP;
>
> This common pattern does look genuinely odd...
>
> The gid_cache is part of the common API, it really shouldn't be kfree'd
> while held struct ib_devices are around. The kfree for all the cache.c
> stuff should probably be called from ib_device_release, not from the
> client release.
>
> That is actually something the current code does that is possibly
> wrong. It is trivially fixed by moving all the kfrees to
> ib_device_release.
>

But the cache as a whole is implemented as a client (cache_client).
Isn't it a bit odd to free a client's data in ib_device_release?

> Next is the READ_ONCE fencing. I think it is totally unnecessary.
>
> Patch #4 does this:
>
> 	down_write(&lists_rwsem);
> 	list_del(&device->core_list);
> 	up_write(&lists_rwsem);
>
> 	list_for_each_entry_reverse(client, &client_list, list)
> 		if (client->remove)
> 			client->remove(device);
>
> So, by the time we get to gid_table_client_cleanup_one, it is no
> longer possible for ib_enum_all_roce_netdevs to use the ib_device we
> are removing (it is taken off the core_list).
>
> Since all the queued work calls ib_enum_all_roce_netdevs, it is
> impossible for something like ib_cache_gid_add to be called from the
> work queue with the ib_dev under removal.
>
> In fact, even the flush_work is not needed because of how lists_rwsem
> is being used: we can not remove something from the core list until
> there are no ib_enum_all_roce_netdevs callbacks running.
>

Correct, it's no longer needed when lists_rwsem protects the list.
Thanks for pointing this out.

> Also, did you notice the double flush of the work queue? One is
> enough:
>
> static void ib_cache_cleanup_one(struct ib_device *device)
> {
> 	ib_unregister_event_handler(&device->cache.event_handler);
> 	flush_workqueue(ib_wq);
> 	gid_table_client_cleanup_one(device);
>
> static void gid_table_client_cleanup_one(struct ib_device *ib_dev)
> {
> 	flush_workqueue(ib_wq);
>

Correct, I'll fix that.

> No other locking problems screamed out at me, but it is a big patch,
> and I haven't looked closely at all of it.
>

Thanks for the review. I'll fix those issues.

Matan

> Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html