Re: [PATCH v4 2/3] cxlflash: Superpipe support

Michael Neuling <mikey@xxxxxxxxxxx> · Wed, 12 Aug 2015 13:54:35 +1000

> >> +		pr_debug("%s: Wait for user context to quiesce...\n", __func__);
> >> +		wake_up_all(&cfg->limbo_waitq);
> >> +		ssleep(1);
> > 
> > Why 1 sec and why in a loop?  Can’t you poll/wait for completion somewhere?
> 
> This routine is called when the device is being removed and we need to stall until
> the user contexts are made aware of removal [by marking them in error] and have
> completely gone away [there are no longer any contexts in our table or list].
> 
> The 1 second sleep is simply to give the users time to cleanup once they see
> the notification. We can make it a larger/smaller value or remove it entirely, but
> I felt that since this is not a hot path there was no reason to perform a busy-wait
> style of loop here and yield to someone else.
> 
> As for the completion/poll, I did consider that but couldn’t come up with a clean
> way of implementing given how we designed the notification/cleanup mechanism
> (really just a failed recovery). We can certainly look at doing that as an
> enhancement in the future.

Does the API allow this flexibility in the future?

> 
> >> +	if (likely(ctxid < MAX_CONTEXT)) {
> >> +retry:
> >> +		rc = mutex_lock_interruptible(&cfg->ctx_tbl_list_mutex);
> >> +		if (rc)
> >> +			goto out;
> >> +
> >> +		ctxi = cfg->ctx_tbl[ctxid];
> >> +		if (ctxi)
> >> +			if ((file && (ctxi->file != file)) ||
> >> +			    (!file && (ctxi->ctxid != rctxid)))
> >> +				ctxi = NULL;
> >> +
> >> +		if ((ctx_ctrl & CTX_CTRL_ERR) ||
> >> +		    (!ctxi && (ctx_ctrl & CTX_CTRL_ERR_FALLBACK)))
> >> +			ctxi = find_error_context(cfg, rctxid, file);
> >> +		if (!ctxi) {
> >> +			mutex_unlock(&cfg->ctx_tbl_list_mutex);
> >> +			goto out;
> >> +		}
> >> +
> >> +		/*
> >> +		 * Need to acquire ownership of the context while still under
> >> +		 * the table/list lock to serialize with a remove thread. Use
> >> +		 * the 'try' to avoid stalling the table/list lock for a single
> >> +		 * context.
> >> +		 */
> >> +		rc = mutex_trylock(&ctxi->mutex);
> >> +		mutex_unlock(&cfg->ctx_tbl_list_mutex);
> >> +		if (!rc)
> >> +			goto retry;
> > 
> > Please just create a loop rather than this goto retry.
> 
> Okay.
> 
> >> +	rhte_checkin(ctxi, rhte);
> >> +	cxlflash_lun_detach(gli);
> >> +
> >> +out:
> >> +	if (unlock_ctx)
> >> +		mutex_unlock(&ctxi->mutex);
> > 
> > Where is the matching lock for this?
> 
> One of the main design points of our context serialization strategy is that
> any context returned from get_context() has been fully validated, will not
> be removed from under you, and _is holding the context mutex_. Thus
> for each of these mutex_unlock(ctxi) you see at the bottom of external
> entry points, the mutex was obtained in get_context().

Should we have something like put_context() that does this?  So there is
matching calls to get/put_context

> 
> >> +	char *tmp = NULL;
> >> +	size_t size;
> >> +	struct afu *afu = cfg->afu;
> >> +	struct ctx_info *ctxi = NULL;
> >> +	struct sisl_rht_entry *rhte;
> >> +
> >> +	size = (MAX_RHT_PER_CONTEXT * sizeof(*ctxi->rht_lun));
> >> +	size += sizeof(*ctxi);
> >> +
> >> +	tmp = kzalloc(size, GFP_KERNEL);
> > 
> > Just do two allocs. One for ctxi and one for rht_lun.  This is overly
> > complicated.
> 
> I disagree that it’s overly complicated. The intention here was to get this
> stuff together to avoid multiple function calls and possibly help out perf-wise
> via locality. That said, I’m not opposed to performing multiple allocations.

I'm not sure I buy it's a performance issue on create_context().  How
often are you calling that?  Doesn't that do a lot of things other than
just this?

Anyway if you want to do that, that's ok I guess, but you need to
document why and what you're doing. 

> Look for this in v5.
> 
> > 
> >> +	if (unlikely(!tmp)) {
> >> +		pr_err("%s: Unable to allocate context! (%ld)\n",
> >> +		       __func__, size);
> >> +		goto out;
> >> +	}
> >> +
> >> +	rhte = (struct sisl_rht_entry *)get_zeroed_page(GFP_KERNEL);
> >> +	if (unlikely(!rhte)) {
> >> +		pr_err("%s: Unable to allocate RHT!\n", __func__);
> >> +		goto err;
> >> +	}
> >> +
> >> +	ctxi = (struct ctx_info *)tmp;
> >> +	tmp += sizeof(*ctxi);
> >> +	ctxi->rht_lun = (struct llun_info **)tmp;
> > 
> > Yuck… just do two allocs rather than this throbbing.
> 
> You got it.
> 
> >> +out:
> >> +	if (unlock_ctx)
> >> +		mutex_unlock(&ctxi->mutex);
> > 
> > Where is the matching lock for this?
> 
> See earlier comment about get_context().
> 
> >> +	if (likely(ctxi))
> >> +		mutex_unlock(&ctxi->mutex);
> > 
> > Where is the matching lock for this?
> 
> Ditto.
> 
> >> +	file = cxl_get_fd(ctx, &cfg->cxl_fops, &fd);
> > 
> > You should create a new fops for each call here.  We write the fops to fill it
> > out.  I think it’ll work as you have now but it's a bit dodgy.
> 
> Are you saying that instead of having a single fops for the device, each of our
> context’s should have an fops that we pass here?
> 
> >> +	list_add(&lun_access->list, &ctxi->luns);
> >> +	mutex_unlock(&ctxi->mutex);
> > 
> > Where is the matching lock for this?
> 
> Just like with get_context(), we leave create_context() with the mutex held.
> 
> >> +	rc = cxlflash_lun_attach(gli, MODE_PHYSICAL);
> >> +	if (unlikely(rc)) {
> >> +		dev_err(dev, "%s: Failed to attach to LUN! (PHYSICAL)\n",
> > 
> > Is this going to spam the console from userspace?  Same below.
> 
> With a well-behaved application it shouldn’t We can certainly look at moving
> to a debug if you feel it’s likely that someone would use this to spam the
> console.
> 
> >> +	struct cxlflash_cfg *cfg = (struct cxlflash_cfg *)sdev->host->hostdata;
> >> +	struct afu *afu = cfg->afu;
> >> +	struct dk_cxlflash_hdr *hdr;
> >> +	char buf[MAX_CXLFLASH_IOCTL_SZ];
> > 
> > why is buf not just a “union cxlflash_ioctls"?
> 
> Because I wanted you to have to look up this define? ;)

:-P

> In all seriousness, we originally had this define as a stand-alone value (it
> wasn’t tied to the union). When I implemented the union, I figured the define
> was more self-descriptive in what we were trying to convey. I’ll change it
> over to the union in v5.
> 
> >> +	hdr = (struct dk_cxlflash_hdr *)&buf;
> >> +	if (hdr->version != 0) {
> >> +		pr_err("%s: Version %u not supported for %s\n",
> >> +		       __func__, hdr->version, decode_ioctl(cmd));
> >> +		rc = -EINVAL;
> >> +		goto cxlflash_ioctl_exit;
> >> +	}
> > 
> > Do you advertise this version anywhere?  Users just have to call it and fail?
> 
> We don’t. We can add a version define to the exported ioctl header.

It needs to be exported dynamically by the kernel.  Not the headers.
Think new software on old kernel and visa versa.

> 
> > You should check hdr->flags are zero incase some new userspace tries to set
> > them.  Same for hdr->rsvd.
> 
> We can’t do that for the flags because those they are used for some of the ioctls.
> The reserved’s however _can_ be checked under the version 0 clause.

OK

> > Also, can you do these checks earlier.  It seems you've already done a bunch of
> > stuff before here.
> 
> From a runtime perspective, we haven’t actually done that much prior to the version
> check. I suppose we could put the check earlier but I don’t like the idea of touching
> data that hasn’t been copied in. So we would need to copy in just the header, then
> check the version. Then if it’s valid, copy in the rest. At some point later on if/when more
> versions are supported we might need to do something like this, but I think it seems a
> bit silly to do that now.

Ok.

> >> +#define DK_CXLFLASH_ATTACH		CXL_IOWR(0x80, dk_cxlflash_attach)
> >> +#define DK_CXLFLASH_USER_DIRECT		CXL_IOWR(0x81, dk_cxlflash_udirect)
> >> +#define DK_CXLFLASH_RELEASE		CXL_IOWR(0x84, dk_cxlflash_release)
> >> +#define DK_CXLFLASH_DETACH		CXL_IOWR(0x85, dk_cxlflash_detach)
> >> +#define DK_CXLFLASH_VERIFY		CXL_IOWR(0x86, dk_cxlflash_verify)
> >> +#define DK_CXLFLASH_RECOVER_AFU		CXL_IOWR(0x88, dk_cxlflash_recover_afu)
> >> +#define DK_CXLFLASH_MANAGE_LUN		CXL_IOWR(0x89, dk_cxlflash_manage_lun)
> > 
> > I'm not sure I'd leave these sparse.  What happens if the vlun patches don't
> > get in?
> 
> I do agree with you. I had wanted to keep them like this as their ordering matched
> closer with how they are expected to be used. That said, I’m okay with moving
> them around to avoid the sparseness.

YEah, plus it breaks your table in the ioctl

Mikey
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html