On Thu, Jan 12, 2023 at 06:08:14AM -0800, Christoph Hellwig wrote:
> On Thu, Jan 12, 2023 at 10:31:01AM +0000, David Howells wrote:
> > > And use the information in the request for this one (see patch below),
> > > and then move this patch first in the series, add an explicit direction
> > > parameter in the gup_flags to the get/pin helper and drop iov_iter_rw
> > > and the whole confusing source/dest information in the iov_iter entirely,
> > > which is a really nice big tree wide cleanup that remove redundant
> > > information.
> >
> > Fine by me, but Al might object as I think he wanted the internal checks.  Al?
>
> I'm happy to have another discussion, but the fact the information in
> the iov_iter is 98% redundant and various callers got it wrong and
> away is a pretty good sign that we should drop this information.  It
> also nicely simplified the API.

I have no problem with getting rid of iov_iter_rw(), but I would really like
to keep ->data_source.  If nothing else, any place getting the direction wrong
is trouble waiting to happen - something that is currently dealing only with
iovec and bvec might be given e.g. a pipe.

Speaking of which, I would really like to get rid of the kludge /dev/sg is
pulling - right now from-device requests there do the following:
	* copy the entire destination in (and better hope that nothing is
	  mapped write-only, etc.)
	* form a request + bio, attach the pages with the destination copy
	  to it
	* submit
	* copy the damn thing back to destination after the completion.

The reason for that is (quoted in commit ecb554a846f8)
====
    The semantics of SG_DXFER_TO_FROM_DEV were:
       - copy user space buffer to kernel (LLD) buffer
       - do SCSI command which is assumed to be of the DATA_IN
         (data from device) variety. This would overwrite
         some or all of the kernel buffer
       - copy kernel (LLD) buffer back to the user space.

    The idea was to detect short reads by filling the original
    user space buffer with some marker bytes ("0xec" it would
    seem in this report). The "resid" value is a better way
    of detecting short reads but that was only added this century
    and requires co-operation from the LLD.
====

IOW, we can't tell how much we actually want to copy out, unless the SCSI
driver in question is recent enough.

Note that the above was written in 2009, so it might not be an issue
these days.  Do we still have SCSI drivers that would not set the residual
on completion of bypass requests?  Because I would obviously very much prefer
to get rid of that copy-in/overwrite/copy-out thing there - given accurate
information about the transfer length it would be easy to do.
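
To make the difference concrete, here is a rough userspace sketch, not kernel
code.  Everything in it is made up for illustration: fake_device_read()
stands in for a DATA_IN command on an LLD that reports a residual, and
legacy_read() / resid_read() are hypothetical names for the two strategies.
It assumes the residual is accurate, which is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical completion info: bytes requested but not transferred. */
struct xfer_result {
	size_t resid;
};

/*
 * Hypothetical stand-in for a from-device command: the "device" has
 * only 'avail' bytes to give, writes them as 0xAB and reports the rest
 * of the request as the residual.
 */
struct xfer_result fake_device_read(unsigned char *kbuf, size_t len,
				    size_t avail)
{
	size_t done = avail < len ? avail : len;

	memset(kbuf, 0xAB, done);
	return (struct xfer_result){ .resid = len - done };
}

/*
 * Current scheme: copy the whole destination in, let the device
 * overwrite part of it, copy the whole thing back.
 * Returns the number of bytes moved between "user" and "kernel".
 */
size_t legacy_read(unsigned char *ubuf, unsigned char *kbuf, size_t len,
		   size_t avail)
{
	memcpy(kbuf, ubuf, len);		/* copy-in */
	fake_device_read(kbuf, len, avail);	/* partial overwrite */
	memcpy(ubuf, kbuf, len);		/* copy-out of everything */
	return 2 * len;
}

/*
 * Residual-based scheme: no copy-in; copy out only what the device
 * actually transferred, leaving the tail of the destination untouched.
 */
size_t resid_read(unsigned char *ubuf, unsigned char *kbuf, size_t len,
		  size_t avail)
{
	struct xfer_result r = fake_device_read(kbuf, len, avail);
	size_t done;

	assert(r.resid <= len);
	done = len - r.resid;
	memcpy(ubuf, kbuf, done);		/* copy-out of transferred part */
	return done;
}
```

With a destination pre-filled with marker bytes, both variants leave the
caller's buffer in the same state on a short read; the second one simply does
not touch the part the device never wrote, and never needs to read the
destination at all.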