On Mon, Apr 26, 2021 at 10:11:07AM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 26, 2021 at 04:08:42PM +0300, Leon Romanovsky wrote:
> > On Mon, Apr 26, 2021 at 09:03:49AM -0300, Jason Gunthorpe wrote:
> > > On Sun, Apr 25, 2021 at 08:38:57PM +0300, Leon Romanovsky wrote:
> > > > On Sun, Apr 25, 2021 at 02:22:54PM -0300, Jason Gunthorpe wrote:
> > > > > On Sun, Apr 25, 2021 at 04:44:55PM +0300, Leon Romanovsky wrote:
> > > > > > > > The proposed prepare/abort/finish flow is much harder to implement correctly.
> > > > > > > > Let's take as an example ib_destroy_qp_user(), we called rdma_rw_cleanup_mrs(),
> > > > > > > > but didn't restore them after .destroy_qp() failure.
> > > > > > >
> > > > > > > I think it is a bug we call rdma_rw code in a user path.
> > > > > >
> > > > > > It was an example of a flow that wasn't restored properly.
> > > > > > The same goes for ib_dealloc_pd_user(), release of __internal_mr.
> > > > > >
> > > > > > Of course, these flows shouldn't fail because of being kernel flows, but it is not clear
> > > > > > from the code.
> > > > >
> > > > > Well, exactly, user flows are not allowed to do extra stuff before
> > > > > calling the driver destroy
> > > > >
> > > > > So the arrangement I gave is reasonable and makes sense, it is
> > > > > certainly better than the hodge podge of ordering that we have today
> > > >
> > > > I thought about a simpler solution - move rdma_restrack_del() before .destroy()
> > > > callbacks together with an attempt to re-add the res object if destroy fails.
> > >
> > > It isn't simpler, now add can fail and can't be recovered
> >
> > It is not different from failure during first call to rdma_restrack_add().
> > You didn't like the idea to be strict with addition of restrack, but
> > want to be strict in reinsert.
>
> It is ugly we couldn't fix the add side, let's not repeat that ugliness
> in other places

Why can't we fix _add?

Thanks

>
> Jason