WRITE

IO SIZE    TOTAL SIZE    No ZC        ZC
1          1MB           22.4 kb/s    19.8 kb/s
32         32MB          711 kb/s     633 kb/s
64         64MB          1.4 mb/s     1.3 mb/s
128        128MB         2.8 mb/s     2.6 mb/s
256        256MB         5.6 mb/s     5.1 mb/s
512        512MB         10.4 mb/s    10.2 mb/s
1024       1GB           19.7 mb/s    20.4 mb/s
2048       2GB           40.1 mb/s    43.7 mb/s
4096       4GB           71.4 mb/s    73.1 mb/s

READ

IO SIZE    TOTAL SIZE    No ZC        ZC
1          1MB           26.6 kb/s    23.1 kb/s
32         32MB          783 kb/s     734 kb/s
64         64MB          1.7 mb/s     1.5 mb/s
128        128MB         3.4 mb/s     3.0 mb/s
256        256MB         4.2 mb/s     5.9 mb/s
512        512MB         6.9 mb/s     11.6 mb/s
1024       1GB           23.3 mb/s    23.4 mb/s
2048       2GB           42.5 mb/s    45.4 mb/s
4096       4GB           67.4 mb/s    73.9 mb/s

As you can see, the difference is marginal at the small sizes, but zero copy
improves as the I/O size increases. In the past we have seen tremendous
improvements with different msizes, mostly because zero copy makes it
possible to ship bigger chunks of data. It could also be my setup/box; even
on the host I am getting the same or similar numbers.
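
Given that the benefit only shows up at the larger I/O sizes, Eric's
suggestion below (fall back to the copying path for small requests) should
be easy to layer on top of this series. Roughly something like the sketch
below for the write side; P9_ZC_THRESHOLD and p9_use_zero_copy() are
placeholder names I am making up here, and the cutoff value would need
tuning with real measurements:

/*
 * Sketch only: skip zero copy for small requests so we do not pay for
 * pinning pages when a memcpy would be cheaper.
 */
#define P9_ZC_THRESHOLD 512    /* bytes; placeholder, needs tuning */

static int p9_use_zero_copy(struct p9_client *clnt, int rsize)
{
    /* transport must support a separate payload buffer at all */
    if ((clnt->trans_mod->pref & P9_TRANS_PREF_PAYLOAD_MASK) !=
                P9_TRANS_PREF_PAYLOAD_SEP)
        return 0;
    /* only pin pages when the request is big enough to be worth it */
    return rsize >= P9_ZC_THRESHOLD;
}

/* and then in p9_client_write(): */
    if (p9_use_zero_copy(clnt, rsize)) {
        req = p9_client_rpc(clnt, P9_TWRITE, "dqF", fid->fid, offset,
                rsize, data ? data : udata);
    } else {
        if (data)
            req = p9_client_rpc(clnt, P9_TWRITE, "dqD", fid->fid,
                    offset, rsize, data);
        else
            req = p9_client_rpc(clnt, P9_TWRITE, "dqU", fid->fid,
                    offset, rsize, udata);
    }

The read side would take the same check, and as Eric points out, keeping a
copy path for small requests might also help with the legacy protocol
issues such as Rerror.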

- JV

On 2/8/2011 1:16 PM, Eric Van Hensbergen wrote:
> oh and, for reference, while a different environment, my request is
> based on a little bit more than idle fancy. Check out the graph on
> page 6 in: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.108.8182&rep=rep1&type=pdf
>
> -eric
>
>
> On Tue, Feb 8, 2011 at 3:09 PM, Eric Van Hensbergen <ericvh@xxxxxxxxx> wrote:
>> One thing I wonder is if we always want to zero copy for payload. In
>> the extreme, do we want to take the overhead of pinning an extra page
>> if we are only reading/writing a byte? memcpy is expensive for large
>> packets, but may actually be more efficient for small packets.
>>
>> Have we done any performance measurements of this code with various
>> payload sizes versus non-zero-copy? Of course that may not really
>> show the impact of pinning the extra pages....
>>
>> In any case, if such a tradeoff did exist, we might choose to not do
>> zero copy for requests smaller than some size -- and that might
>> alleviate some of the problems with the legacy protocols (such as the
>> Rerror issue) -- killing two birds with one stone. In any case, given
>> the implementations it should be really easy to shut me up with
>> comparison data of zc and non-zc for 1, 64, 128, 256, 512, 1024, 2048,
>> 4192 byte payloads (without caches enabled of course).
>>
>> -eric
>>
>>
>> On Mon, Feb 7, 2011 at 12:57 AM, Venkateswararao Jujjuri (JV)
>> <jvrao@xxxxxxxxxxxxxxxxxx> wrote:
>>> On 2/6/2011 11:21 PM, Venkateswararao Jujjuri (JV) wrote:
>>>> Signed-off-by: Venkateswararao Jujjuri <jvrao@xxxxxxxxxxxxxxxxxx>
>>>> ---
>>>>  net/9p/client.c   |   45 +++++++++++++++++++++++++++++++--------------
>>>>  net/9p/protocol.c |   38 ++++++++++++++++++++++++++++++++++++++
>>>>  2 files changed, 69 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/net/9p/client.c b/net/9p/client.c
>>>> index a848bca..f939edf 100644
>>>> --- a/net/9p/client.c
>>>> +++ b/net/9p/client.c
>>>> @@ -1270,7 +1270,14 @@ p9_client_read(struct p9_fid *fid, char *data, char __user *udata, u64 offset,
>>>>      if (count < rsize)
>>>>          rsize = count;
>>>>
>>>> -    req = p9_client_rpc(clnt, P9_TREAD, "dqd", fid->fid, offset, rsize);
>>>> +    if ((clnt->trans_mod->pref & P9_TRANS_PREF_PAYLOAD_MASK) ==
>>>> +            P9_TRANS_PREF_PAYLOAD_SEP) {
>>>> +        req = p9_client_rpc(clnt, P9_TREAD, "dqE", fid->fid, offset,
>>>> +                rsize, data ? data : udata);
>>>> +    } else {
>>>> +        req = p9_client_rpc(clnt, P9_TREAD, "dqd", fid->fid, offset,
>>>> +                rsize);
>>>> +    }
>>>>      if (IS_ERR(req)) {
>>>>          err = PTR_ERR(req);
>>>>          goto error;
>>>> @@ -1284,13 +1291,15 @@ p9_client_read(struct p9_fid *fid, char *data, char __user *udata, u64 offset,
>>>>
>>>>      P9_DPRINTK(P9_DEBUG_9P, "<<< RREAD count %d\n", count);
>>>>
>>>> -    if (data) {
>>>> -        memmove(data, dataptr, count);
>>>> -    } else {
>>>> -        err = copy_to_user(udata, dataptr, count);
>>>> -        if (err) {
>>>> -            err = -EFAULT;
>>>> -            goto free_and_error;
>>>> +    if (!req->tc->pbuf_size) {
>>>> +        if (data) {
>>>> +            memmove(data, dataptr, count);
>>>> +        } else {
>>>> +            err = copy_to_user(udata, dataptr, count);
>>>> +            if (err) {
>>>> +                err = -EFAULT;
>>>> +                goto free_and_error;
>>>> +            }
>>>>          }
>>>>      }
>>>>      p9_free_req(clnt, req);
>>>> @@ -1323,12 +1332,20 @@ p9_client_write(struct p9_fid *fid, char *data, const char __user *udata,
>>>>
>>>>      if (count < rsize)
>>>>          rsize = count;
>>>> -    if (data)
>>>> -        req = p9_client_rpc(clnt, P9_TWRITE, "dqD", fid->fid, offset,
>>>> -                rsize, data);
>>>> -    else
>>>> -        req = p9_client_rpc(clnt, P9_TWRITE, "dqU", fid->fid, offset,
>>>> -                rsize, udata);
>>>> +
>>>> +    if ((clnt->trans_mod->pref & P9_TRANS_PREF_PAYLOAD_MASK) ==
>>>> +            P9_TRANS_PREF_PAYLOAD_SEP) {
>>>> +        req = p9_client_rpc(clnt, P9_TWRITE, "dqF", fid->fid, offset,
>>>> +                rsize, data ? data : udata);
>>>> +    } else {
>>>> +        if (data)
>>>> +            req = p9_client_rpc(clnt, P9_TWRITE, "dqD", fid->fid,
>>>> +                    offset, rsize, data);
>>>> +        else
>>>> +            req = p9_client_rpc(clnt, P9_TWRITE, "dqU", fid->fid,
>>>> +                    offset, rsize, udata);
>>>> +    }
>>>> +
>>>>      if (IS_ERR(req)) {
>>>>          err = PTR_ERR(req);
>>>>          goto error;
>>>> diff --git a/net/9p/protocol.c b/net/9p/protocol.c
>>>> index dfc358f..ea778dd 100644
>>>> --- a/net/9p/protocol.c
>>>> +++ b/net/9p/protocol.c
>>>> @@ -114,6 +114,24 @@ pdu_write_u(struct p9_fcall *pdu, const char __user *udata, size_t size)
>>>>      return size - len;
>>>>  }
>>>>
>>>> +static size_t
>>>> +pdu_write_uw(struct p9_fcall *pdu, const char *udata, size_t size)
>>>> +{
>>>> +    size_t len = min(pdu->capacity - pdu->size, size);
>>>> +    pdu->pbuf = udata;
>>>> +    pdu->pbuf_size = len;
>>>> +    return size - len;
>>>> +}
>>>> +
>>>> +static size_t
>>>> +pdu_write_ur(struct p9_fcall *pdu, const char *udata, size_t size)
>>>> +{
>>>> +    size_t len = min(pdu->capacity - pdu->size, size);
>>>> +    pdu->pbuf = udata;
>>>> +    pdu->pbuf_size = len;
>>>> +    return size - len;
>>>> +}
>>>> +
>>>>  /*
>>>>      b - int8_t
>>>>      w - int16_t
>>>> @@ -445,6 +463,26 @@ p9pdu_vwritef(struct p9_fcall *pdu, int proto_version, const char *fmt,
>>>>                  errcode = -EFAULT;
>>>>          }
>>>>          break;
>>>> +        case 'E':{
>>>> +                int32_t count = va_arg(ap, int32_t);
>>>> +                const char *udata = va_arg(ap, const void *);
>>>> +                errcode = p9pdu_writef(pdu, proto_version, "d",
>>>> +                        count);
>>>> +                if (!errcode && pdu_write_ur(pdu, udata,
>>>> +                        count))
>>>> +                    errcode = -EFAULT;
>>>> +            }
>>>> +            break;
>>>> +        case 'F':{
>>>> +                int32_t count = va_arg(ap, int32_t);
>>>> +                const char *udata = va_arg(ap, const void *);
>>>> +                errcode = p9pdu_writef(pdu, proto_version, "d",
>>>> +                        count);
>>>> +                if (!errcode && pdu_write_uw(pdu, udata,
>>>> +                        count))
>>>> +                    errcode = -EFAULT;
>>>> +            }
>>>> +            break;
>>>>          case 'U':{
>>>>                  int32_t count = va_arg(ap, int32_t);
>>>>                  const char __user *udata =
>>>
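
Eric, on generating the per-payload-size comparison you asked for: a loop
along the following lines should do for the write side. This is only a
sketch of the shape of the test (hypothetical zcbench.c, error handling
trimmed), not the exact tool behind the table above, and the run should be
done against a mount with caching disabled as you said:

/* zcbench.c -- sketch of a sequential write throughput loop */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    size_t bsize;
    long count, i;
    char *buf;
    struct timespec t0, t1;
    double secs;
    int fd;

    if (argc < 4) {
        fprintf(stderr, "usage: %s <file> <block-bytes> <count>\n", argv[0]);
        return 1;
    }
    bsize = strtoul(argv[2], NULL, 0);
    count = strtol(argv[3], NULL, 0);

    buf = malloc(bsize);
    if (!buf)
        return 1;
    memset(buf, 'x', bsize);

    fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < count; i++) {
        if (write(fd, buf, bsize) != (ssize_t)bsize) {
            perror("write");
            return 1;
        }
    }
    fsync(fd);                      /* flush before stopping the clock */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%zu B x %ld: %.1f KB/s\n", bsize, count,
           (double)bsize * count / 1024.0 / secs);

    close(fd);
    free(buf);
    return 0;
}

Build it with something like "gcc -O2 -o zcbench zcbench.c" (older glibc
may need -lrt for clock_gettime) and run it once per block size against the
9p mount, with and without the zero-copy patches; the read direction would
be the same loop with read() against a pre-written file.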