On 2/8/2011 1:09 PM, Eric Van Hensbergen wrote:
> One thing I wonder is if we always want to zero copy for payload. In
> the extreme, do we want to take the overhead of pinning an extra page
> if we are only reading/writing a byte? memcpy is expensive for large
> packets, but may actually be more efficient for small packets.

I am not a memory expert, but I would assume memcpy also needs to do
something similar to what get_user_pages() does, short of pinning the
pages. But I see the point; a rough sketch of what the pinning path
costs is further down, after the small-request check.

> Have we done any performance measurements of this code with various
> payload sizes versus non-zero-copy? Of course that may not really
> show the impact of pinning the extra pages....

All our testing so far has been with large buffers; we have not tested
with small buffers.

> In any case, if such a tradeoff did exist, we might choose to not do
> zero copy for requests smaller than some size -- and that might
> alleviate some of the problems with the legacy protocols (such as the
> Rerror issue) -- killing two birds with one stone.

I think it is a wise decision to avoid zero copy when
iosize + hdr_size <= pagesize, but it does not remove any of today's
complexity, except perhaps saving an if condition.
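To make that concrete, the check could look something like this in
p9_client_read() (a sketch only, untested; I am using P9_IOHDRSZ as a
stand-in for whatever header size we settle on):

	/* Take the copying path whenever the whole request fits in a
	 * single page, so we never pin pages for tiny transfers. */
	if (((clnt->trans_mod->pref & P9_TRANS_PREF_PAYLOAD_MASK) ==
				P9_TRANS_PREF_PAYLOAD_SEP) &&
	    (rsize + P9_IOHDRSZ > PAGE_SIZE)) {
		req = p9_client_rpc(clnt, P9_TREAD, "dqE", fid->fid, offset,
				rsize, data ? data : udata);
	} else {
		req = p9_client_rpc(clnt, P9_TREAD, "dqd", fid->fid, offset,
				rsize);
	}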
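And for reference on the pinning overhead Eric raises, this is roughly
the per-request cost the zero-copy path pays (again a sketch, not code
from the patch; p9_pin_payload() is a made-up name, and error handling
plus the matching put_page() calls are omitted):

	/* Pin the user buffer so the transport can reference the pages
	 * directly; each page must be released with put_page() once the
	 * request completes. */
	static int p9_pin_payload(char __user *udata, int nr_pages,
				  struct page **pages)
	{
		int nr;

		down_read(&current->mm->mmap_sem);
		nr = get_user_pages(current, current->mm,
				    (unsigned long)udata & PAGE_MASK,
				    nr_pages, 1 /* write */, 0 /* force */,
				    pages, NULL);
		up_read(&current->mm->mmap_sem);
		return nr;
	}

Whether one get_user_pages() round trip per request is cheaper than a
copy is exactly the small-payload question.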
> In any case, given the implementations it should be really easy to
> shut me up with comparison data of zc and non-zc for 1, 64, 128, 256,
> 512, 1024, 2048, 4192 byte payloads (without caches enabled of
> course).
>
> -eric

I think this is a good experiment; we will publish data. A sketch of
the harness I have in mind follows.
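Userspace only and untested; it assumes a v9fs mount at /mnt/9p with
caching disabled, /mnt/9p/bench is just a scratch file, and the sizes
are the ones Eric lists:

	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/time.h>
	#include <unistd.h>

	#define ITERATIONS 10000

	int main(void)
	{
		static const size_t sizes[] = { 1, 64, 128, 256, 512,
						1024, 2048, 4192 };
		static char buf[8192];
		struct timeval t0, t1;
		unsigned int i, j;
		long usec;
		int fd;

		memset(buf, 'x', sizeof(buf));
		fd = open("/mnt/9p/bench", O_WRONLY | O_CREAT | O_TRUNC,
			  0644);
		if (fd < 0) {
			perror("open");
			return 1;
		}
		for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
			gettimeofday(&t0, NULL);
			for (j = 0; j < ITERATIONS; j++) {
				if (pwrite(fd, buf, sizes[i], 0) < 0) {
					perror("pwrite");
					return 1;
				}
			}
			gettimeofday(&t1, NULL);
			usec = (t1.tv_sec - t0.tv_sec) * 1000000L +
			       (t1.tv_usec - t0.tv_usec);
			printf("%6zu bytes: %ld usec/op\n", sizes[i],
			       usec / ITERATIONS);
		}
		close(fd);
		return 0;
	}

The read side would be the symmetric loop with pread(), and running the
same binary against a zc and a non-zc client gives the comparison.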
> On Mon, Feb 7, 2011 at 12:57 AM, Venkateswararao Jujjuri (JV)
> <jvrao@xxxxxxxxxxxxxxxxxx> wrote:
>> On 2/6/2011 11:21 PM, Venkateswararao Jujjuri (JV) wrote:
>>> Signed-off-by: Venkateswararao Jujjuri <jvrao@xxxxxxxxxxxxxxxxxx>
>>> ---
>>>  net/9p/client.c   |   45 +++++++++++++++++++++++++++++++--------------
>>>  net/9p/protocol.c |   38 ++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 69 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/net/9p/client.c b/net/9p/client.c
>>> index a848bca..f939edf 100644
>>> --- a/net/9p/client.c
>>> +++ b/net/9p/client.c
>>> @@ -1270,7 +1270,14 @@ p9_client_read(struct p9_fid *fid, char *data, char __user *udata, u64 offset,
>>>  	if (count < rsize)
>>>  		rsize = count;
>>>
>>> -	req = p9_client_rpc(clnt, P9_TREAD, "dqd", fid->fid, offset, rsize);
>>> +	if ((clnt->trans_mod->pref & P9_TRANS_PREF_PAYLOAD_MASK) ==
>>> +			P9_TRANS_PREF_PAYLOAD_SEP) {
>>> +		req = p9_client_rpc(clnt, P9_TREAD, "dqE", fid->fid, offset,
>>> +				rsize, data ? data : udata);
>>> +	} else {
>>> +		req = p9_client_rpc(clnt, P9_TREAD, "dqd", fid->fid, offset,
>>> +				rsize);
>>> +	}
>>>  	if (IS_ERR(req)) {
>>>  		err = PTR_ERR(req);
>>>  		goto error;
>>> @@ -1284,13 +1291,15 @@ p9_client_read(struct p9_fid *fid, char *data, char __user *udata, u64 offset,
>>>
>>>  	P9_DPRINTK(P9_DEBUG_9P, "<<< RREAD count %d\n", count);
>>>
>>> -	if (data) {
>>> -		memmove(data, dataptr, count);
>>> -	} else {
>>> -		err = copy_to_user(udata, dataptr, count);
>>> -		if (err) {
>>> -			err = -EFAULT;
>>> -			goto free_and_error;
>>> +	if (!req->tc->pbuf_size) {
>>> +		if (data) {
>>> +			memmove(data, dataptr, count);
>>> +		} else {
>>> +			err = copy_to_user(udata, dataptr, count);
>>> +			if (err) {
>>> +				err = -EFAULT;
>>> +				goto free_and_error;
>>> +			}
>>>  		}
>>>  	}
>>>  	p9_free_req(clnt, req);
>>> @@ -1323,12 +1332,20 @@ p9_client_write(struct p9_fid *fid, char *data, const char __user *udata,
>>>
>>>  	if (count < rsize)
>>>  		rsize = count;
>>> -	if (data)
>>> -		req = p9_client_rpc(clnt, P9_TWRITE, "dqD", fid->fid, offset,
>>> -				rsize, data);
>>> -	else
>>> -		req = p9_client_rpc(clnt, P9_TWRITE, "dqU", fid->fid, offset,
>>> -				rsize, udata);
>>> +
>>> +	if ((clnt->trans_mod->pref & P9_TRANS_PREF_PAYLOAD_MASK) ==
>>> +			P9_TRANS_PREF_PAYLOAD_SEP) {
>>> +		req = p9_client_rpc(clnt, P9_TWRITE, "dqF", fid->fid, offset,
>>> +				rsize, data ? data : udata);
>>> +	} else {
>>> +		if (data)
>>> +			req = p9_client_rpc(clnt, P9_TWRITE, "dqD", fid->fid,
>>> +					offset, rsize, data);
>>> +		else
>>> +			req = p9_client_rpc(clnt, P9_TWRITE, "dqU", fid->fid,
>>> +					offset, rsize, udata);
>>> +	}
>>> +
>>>  	if (IS_ERR(req)) {
>>>  		err = PTR_ERR(req);
>>>  		goto error;
>>> diff --git a/net/9p/protocol.c b/net/9p/protocol.c
>>> index dfc358f..ea778dd 100644
>>> --- a/net/9p/protocol.c
>>> +++ b/net/9p/protocol.c
>>> @@ -114,6 +114,24 @@ pdu_write_u(struct p9_fcall *pdu, const char __user *udata, size_t size)
>>>  	return size - len;
>>>  }
>>>
>>> +static size_t
>>> +pdu_write_uw(struct p9_fcall *pdu, const char *udata, size_t size)
>>> +{
>>> +	size_t len = min(pdu->capacity - pdu->size, size);
>>> +	pdu->pbuf = udata;
>>> +	pdu->pbuf_size = len;
>>> +	return size - len;
>>> +}
>>> +
>>> +static size_t
>>> +pdu_write_ur(struct p9_fcall *pdu, const char *udata, size_t size)
>>> +{
>>> +	size_t len = min(pdu->capacity - pdu->size, size);
>>> +	pdu->pbuf = udata;
>>> +	pdu->pbuf_size = len;
>>> +	return size - len;
>>> +}
>>> +
>>>  /*
>>>  	b - int8_t
>>>  	w - int16_t
>>> @@ -445,6 +463,26 @@ p9pdu_vwritef(struct p9_fcall *pdu, int proto_version, const char *fmt,
>>>  				errcode = -EFAULT;
>>>  			}
>>>  			break;
>>> +		case 'E':{
>>> +				int32_t count = va_arg(ap, int32_t);
>>> +				const char *udata = va_arg(ap, const void *);
>>> +				errcode = p9pdu_writef(pdu, proto_version, "d",
>>> +						count);
>>> +				if (!errcode && pdu_write_ur(pdu, udata,
>>> +							count))
>>> +					errcode = -EFAULT;
>>> +			}
>>> +			break;
>>> +		case 'F':{
>>> +				int32_t count = va_arg(ap, int32_t);
>>> +				const char *udata = va_arg(ap, const void *);
>>> +				errcode = p9pdu_writef(pdu, proto_version, "d",
>>> +						count);
>>> +				if (!errcode && pdu_write_uw(pdu, udata,
>>> +							count))
>>> +					errcode = -EFAULT;
>>> +			}
>>> +			break;
>>>  		case 'U':{
>>>  				int32_t count = va_arg(ap, int32_t);
>>>  				const char __user *udata =