> >> + pr_debug("%s: Wait for user context to quiesce...\n", __func__); > >> + wake_up_all(&cfg->limbo_waitq); > >> + ssleep(1); > > > > Why 1 sec and why in a loop? Can’t you poll/wait for completion somewhere? > > This routine is called when the device is being removed and we need to stall until > the user contexts are made aware of removal [by marking them in error] and have > completely gone away [there are no longer any contexts in our table or list]. > > The 1 second sleep is simply to give the users time to cleanup once they see > the notification. We can make it a larger/smaller value or remove it entirely, but > I felt that since this is not a hot path there was no reason to perform a busy-wait > style of loop here and yield to someone else. > > As for the completion/poll, I did consider that but couldn’t come up with a clean > way of implementing given how we designed the notification/cleanup mechanism > (really just a failed recovery). We can certainly look at doing that as an > enhancement in the future. Does the API allow this flexibility in the future? > > >> + if (likely(ctxid < MAX_CONTEXT)) { > >> +retry: > >> + rc = mutex_lock_interruptible(&cfg->ctx_tbl_list_mutex); > >> + if (rc) > >> + goto out; > >> + > >> + ctxi = cfg->ctx_tbl[ctxid]; > >> + if (ctxi) > >> + if ((file && (ctxi->file != file)) || > >> + (!file && (ctxi->ctxid != rctxid))) > >> + ctxi = NULL; > >> + > >> + if ((ctx_ctrl & CTX_CTRL_ERR) || > >> + (!ctxi && (ctx_ctrl & CTX_CTRL_ERR_FALLBACK))) > >> + ctxi = find_error_context(cfg, rctxid, file); > >> + if (!ctxi) { > >> + mutex_unlock(&cfg->ctx_tbl_list_mutex); > >> + goto out; > >> + } > >> + > >> + /* > >> + * Need to acquire ownership of the context while still under > >> + * the table/list lock to serialize with a remove thread. Use > >> + * the 'try' to avoid stalling the table/list lock for a single > >> + * context. > >> + */ > >> + rc = mutex_trylock(&ctxi->mutex); > >> + mutex_unlock(&cfg->ctx_tbl_list_mutex); > >> + if (!rc) > >> + goto retry; > > > > Please just create a loop rather than this goto retry. > > Okay. > > >> + rhte_checkin(ctxi, rhte); > >> + cxlflash_lun_detach(gli); > >> + > >> +out: > >> + if (unlock_ctx) > >> + mutex_unlock(&ctxi->mutex); > > > > Where is the matching lock for this? > > One of the main design points of our context serialization strategy is that > any context returned from get_context() has been fully validated, will not > be removed from under you, and _is holding the context mutex_. Thus > for each of these mutex_unlock(ctxi) you see at the bottom of external > entry points, the mutex was obtained in get_context(). Should we have something like put_context() that does this? So there is matching calls to get/put_context > > >> + char *tmp = NULL; > >> + size_t size; > >> + struct afu *afu = cfg->afu; > >> + struct ctx_info *ctxi = NULL; > >> + struct sisl_rht_entry *rhte; > >> + > >> + size = (MAX_RHT_PER_CONTEXT * sizeof(*ctxi->rht_lun)); > >> + size += sizeof(*ctxi); > >> + > >> + tmp = kzalloc(size, GFP_KERNEL); > > > > Just do two allocs. One for ctxi and one for rht_lun. This is overly > > complicated. > > I disagree that it’s overly complicated. The intention here was to get this > stuff together to avoid multiple function calls and possibly help out perf-wise > via locality. That said, I’m not opposed to performing multiple allocations. I'm not sure I buy it's a performance issue on create_context(). How often are you calling that? Doesn't that do a lot of things other than just this? Anyway if you want to do that, that's ok I guess, but you need to document why and what you're doing. > Look for this in v5. > > > > >> + if (unlikely(!tmp)) { > >> + pr_err("%s: Unable to allocate context! (%ld)\n", > >> + __func__, size); > >> + goto out; > >> + } > >> + > >> + rhte = (struct sisl_rht_entry *)get_zeroed_page(GFP_KERNEL); > >> + if (unlikely(!rhte)) { > >> + pr_err("%s: Unable to allocate RHT!\n", __func__); > >> + goto err; > >> + } > >> + > >> + ctxi = (struct ctx_info *)tmp; > >> + tmp += sizeof(*ctxi); > >> + ctxi->rht_lun = (struct llun_info **)tmp; > > > > Yuck… just do two allocs rather than this throbbing. > > You got it. > > >> +out: > >> + if (unlock_ctx) > >> + mutex_unlock(&ctxi->mutex); > > > > Where is the matching lock for this? > > See earlier comment about get_context(). > > >> + if (likely(ctxi)) > >> + mutex_unlock(&ctxi->mutex); > > > > Where is the matching lock for this? > > Ditto. > > >> + file = cxl_get_fd(ctx, &cfg->cxl_fops, &fd); > > > > You should create a new fops for each call here. We write the fops to fill it > > out. I think it’ll work as you have now but it's a bit dodgy. > > Are you saying that instead of having a single fops for the device, each of our > context’s should have an fops that we pass here? > > >> + list_add(&lun_access->list, &ctxi->luns); > >> + mutex_unlock(&ctxi->mutex); > > > > Where is the matching lock for this? > > Just like with get_context(), we leave create_context() with the mutex held. > > >> + rc = cxlflash_lun_attach(gli, MODE_PHYSICAL); > >> + if (unlikely(rc)) { > >> + dev_err(dev, "%s: Failed to attach to LUN! (PHYSICAL)\n", > > > > Is this going to spam the console from userspace? Same below. > > With a well-behaved application it shouldn’t We can certainly look at moving > to a debug if you feel it’s likely that someone would use this to spam the > console. > > >> + struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)sdev->host->hostdata; > >> + struct afu *afu = cfg->afu; > >> + struct dk_cxlflash_hdr *hdr; > >> + char buf[MAX_CXLFLASH_IOCTL_SZ]; > > > > why is buf not just a “union cxlflash_ioctls"? > > Because I wanted you to have to look up this define? ;) :-P > In all seriousness, we originally had this define as a stand-alone value (it > wasn’t tied to the union). When I implemented the union, I figured the define > was more self-descriptive in what we were trying to convey. I’ll change it > over to the union in v5. > > >> + hdr = (struct dk_cxlflash_hdr *)&buf; > >> + if (hdr->version != 0) { > >> + pr_err("%s: Version %u not supported for %s\n", > >> + __func__, hdr->version, decode_ioctl(cmd)); > >> + rc = -EINVAL; > >> + goto cxlflash_ioctl_exit; > >> + } > > > > Do you advertise this version anywhere? Users just have to call it and fail? > > We don’t. We can add a version define to the exported ioctl header. It needs to be exported dynamically by the kernel. Not the headers. Think new software on old kernel and visa versa. > > > You should check hdr->flags are zero incase some new userspace tries to set > > them. Same for hdr->rsvd. > > We can’t do that for the flags because those they are used for some of the ioctls. > The reserved’s however _can_ be checked under the version 0 clause. OK > > Also, can you do these checks earlier. It seems you've already done a bunch of > > stuff before here. > > From a runtime perspective, we haven’t actually done that much prior to the version > check. I suppose we could put the check earlier but I don’t like the idea of touching > data that hasn’t been copied in. So we would need to copy in just the header, then > check the version. Then if it’s valid, copy in the rest. At some point later on if/when more > versions are supported we might need to do something like this, but I think it seems a > bit silly to do that now. Ok. > >> +#define DK_CXLFLASH_ATTACH CXL_IOWR(0x80, dk_cxlflash_attach) > >> +#define DK_CXLFLASH_USER_DIRECT CXL_IOWR(0x81, dk_cxlflash_udirect) > >> +#define DK_CXLFLASH_RELEASE CXL_IOWR(0x84, dk_cxlflash_release) > >> +#define DK_CXLFLASH_DETACH CXL_IOWR(0x85, dk_cxlflash_detach) > >> +#define DK_CXLFLASH_VERIFY CXL_IOWR(0x86, dk_cxlflash_verify) > >> +#define DK_CXLFLASH_RECOVER_AFU CXL_IOWR(0x88, dk_cxlflash_recover_afu) > >> +#define DK_CXLFLASH_MANAGE_LUN CXL_IOWR(0x89, dk_cxlflash_manage_lun) > > > > I'm not sure I'd leave these sparse. What happens if the vlun patches don't > > get in? > > I do agree with you. I had wanted to keep them like this as their ordering matched > closer with how they are expected to be used. That said, I’m okay with moving > them around to avoid the sparseness. YEah, plus it breaks your table in the ioctl Mikey -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html