On Tue, May 15, 2018 at 04:41:00PM -0700, Roland Dreier wrote: > > This is basically what I was thinking as well.. > > > > My fear is that we have lots and lots of places where the > > cma_comp_exch "lock" isn't right in this same sort of way, and > > everything has to be audited.. > > OK, glad you like the approach. > > Yeah, I think at least any function that has a cma_comp_exch() to > unwind things in the error return path is suspect. For example, can > rdma_connect() race against rdma_set_ib_path()? I can take a more > comprehensive look tomorrow. The thing that gives me pause, is that I don't have a clear picture in my head how this this entire comp_exch based 'locking' scheme is even supposed to work. I just don't get it. It seems completely broken. Am I missing something? Did it start out OK and just get broken over the last decade? The answer matters, because if adding "ING"s states is the right design overall then, OK, but if the answer is this entire thing needs to go away and be replaced with a lock, then lets just do that instead. For instance, look at this.. int rdma_set_ib_path(struct rdma_cm_id *id, struct sa_path_rec *path_rec) { if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_RESOLVED, RDMA_CM_ROUTE_RESOLVED)) return -EINVAL; id->route.path_rec = kmemdup(path_rec, sizeof(*path_rec), GFP_KERNEL); [..] err_free: kfree(id->route.path_rec); id->route.path_rec = NULL; err: cma_comp_exch(id_priv, RDMA_CM_ROUTE_RESOLVED, RDMA_CM_ADDR_RESOLVED); So this has the same error problem, but even on the success it is racy nightmare. eg I race the above between comp_exch and kmemdup with this: int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param) { if (!cma_comp_exch(id_priv, RDMA_CM_ROUTE_RESOLVED, RDMA_CM_CONNECT)) return -EINVAL; And then link with this: static int cma_sidr_rep_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event) { mutex_lock(&id_priv->handler_mutex); if (id_priv->state != RDMA_CM_CONNECT) goto out; [..] ib_init_ah_attr_from_path(id_priv->id.device, id_priv->id.port_num, id_priv->id.route.path_rec, &event.param.ud.ah_attr); And go boom with a null pointer deref without requiring the error path, or I use the error path and get a more reliable NULL. Is there some subtle reason why these three calls can't happen concurrently? It was easy to find this combination, so I be there are lots more even if this particular avenue is blocked off. Basically, for what I can tell, the entire thing is total crazy - I don't think we can stomp syzkaller bugs one at a time and ever get anywhere when the entire FSM idea seems deeply flawed. This is why I didn't make a patch yet :) So, what is the solution here? How should this actually look? Is there some sensible design pattern with a multi-threaded comp_exch based FSM that I am not aware of? The most familiar answer is to replace comp_exch with a proper mutex across the entire state entry action and exit and forget about adding *INGS. Perhaps we could do that without too much invasion by doing something like this: if (!cma_state_check_enter(id_priv, RDMA_CM_IDLE, RDMA_CM_ADDR_BOUND)) [..] cma_state_success(id_priv); err: cma_state_rollback(id_priv); At least there is some kind of sane and auditable pattern for building this FSM.. I think I prefer this to adding a bunch of "ING"s.. The semantic would be that no cma_state_check_enter can succeed during the period between enter and success/rollback. Then every single function that does cma_exch gets audited and gains a success or rollback on its exit path, forming a good try_lock around the entire state action block. At the very least this guarentees that the FSM actions and arcs are sane for concurrency - that just leaves auditing the non-arc methods.. But, again, I don't know why it is like this, and I fear it will just cause other runtime bugs, as we now cause comp_exch to fail while any action is running, which opens a whole other avenue of auditing :( Saen? Parav? Any light on this? Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html