> -----Original Message-----
> From: Tom Talpey [mailto:tom@xxxxxxxxxx]
> Sent: Tuesday, April 30, 2013 16:05
> To: Yan Burman
> Cc: J. Bruce Fields; Wendy Cheng; Atchley, Scott; Tom Tucker;
> linux-rdma@xxxxxxxxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx; Or Gerlitz
> Subject: Re: NFS over RDMA benchmark
>
> On 4/30/2013 1:09 AM, Yan Burman wrote:
> >
> >> -----Original Message-----
> >> From: J. Bruce Fields [mailto:bfields@xxxxxxxxxxxx]
> >> Sent: Sunday, April 28, 2013 17:43
> >> To: Yan Burman
> >> Cc: Wendy Cheng; Atchley, Scott; Tom Tucker;
> >> linux-rdma@xxxxxxxxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx; Or Gerlitz
> >> Subject: Re: NFS over RDMA benchmark
> >>
> >> On Sun, Apr 28, 2013 at 06:28:16AM +0000, Yan Burman wrote:
> >>>>> On Wed, Apr 17, 2013 at 7:36 AM, Yan Burman <yanb@xxxxxxxxxxxx>
> >>>>>> I've been trying to do some benchmarks for NFS over RDMA and I
> >>>>>> seem to only get about half of the bandwidth that the HW can
> >>>>>> give me.
> >>>>>> My setup consists of 2 servers each with 16 cores, 32Gb of
> >>>>>> memory, and a Mellanox ConnectX3 QDR card over PCI-e gen3.
> >>>>>> These servers are connected to a QDR IB switch. The backing
> >>>>>> storage on the server is tmpfs mounted with noatime.
> >>>>>> I am running kernel 3.5.7.
> >>>>>>
> >>>>>> When running ib_send_bw, I get 4.3-4.5 GB/sec for block sizes
> >>>>>> 4-512K.
> >>>>>> When I run fio over rdma mounted nfs, I get 260-2200MB/sec for
> >>>>>> the same block sizes (4-512K). Running over IPoIB-CM, I get
> >>>>>> 200-980MB/sec.
> >> ...
> >>>> I am trying to get maximum performance from a single server - I
> >>>> used 2 processes in the fio test - more than 2 did not show any
> >>>> performance boost.
> >>>> I tried running fio from 2 different PCs on 2 different files,
> >>>> but the sum of the two is more or less the same as running from a
> >>>> single client PC.
> >>>>
> >
> > I finally got up to 4.1GB/sec bandwidth with RDMA (IPoIB-CM bandwidth
> > is also way higher now).
> > For some reason when I had the Intel IOMMU enabled, the performance
> > dropped significantly.
> > I now get up to ~95K IOPS and 4.1GB/sec bandwidth.
>
> Excellent, but is that 95K IOPS a typo? At 4KB, that's less than 400MBps.

That is not a typo. I get 95K IOPS with a randrw test at a 4K block size,
and 4.1GB/s with a randread test at a 256K block size (an approximate fio
job for these runs is sketched at the end of this message).

> What is the client CPU percentage you see under this workload, and how
> different are the NFS/RDMA and NFS/IPoIB overheads?

NFS/RDMA uses about 20-30% more CPU than NFS/IPoIB, but RDMA gives almost
twice the bandwidth of IPoIB.
Overall, CPU usage gets up to about 20% for randread and 50% for randwrite.

> > Now I will take care of the issue that I am running only at 40Gbit/s
> > instead of 56Gbit/s, but that is another unrelated problem (I suspect
> > I have a cable issue).
> >
> > This is still strange, since ib_send_bw with the Intel IOMMU enabled
> > did get up to 4.5GB/sec, so why did the Intel IOMMU affect only the
> > NFS code?
>
> You'll need to do more profiling to track that down. I would suspect
> that ib_send_bw is using some sort of direct hardware access, bypassing
> the IOMMU management and possibly performing no dynamic memory
> registration.
>
> The NFS/RDMA code goes via the standard kernel DMA API, and correctly
> registers/deregisters memory on a per-I/O basis in order to provide
> storage data integrity.
> Perhaps there are overheads in the IOMMU management which can be
> addressed.
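
For reference, below is a rough sketch of the contrast Tom is describing.
It is illustrative only -- it is not the actual xprtrdma/svcrdma code, and
the two helper function names are made up; only the ib_dma_* calls are the
real kernel API. The point is that the NFS/RDMA path maps and unmaps the
payload on every I/O, so with the Intel IOMMU enabled each request pays
for IOMMU page-table setup and IOTLB invalidation, while a perftest-style
benchmark registers its buffer once up front and never sees that cost in
the timed loop.

/*
 * Illustrative sketch only -- not the actual NFS/RDMA transport code.
 */
#include <linux/errno.h>
#include <rdma/ib_verbs.h>

static int map_payload_for_rdma(struct ib_device *dev, struct page *page,
				unsigned int len, u64 *dma_addr)
{
	u64 addr;

	/* Per-I/O mapping: goes through the DMA API (and the IOMMU). */
	addr = ib_dma_map_page(dev, page, 0, len, DMA_BIDIRECTIONAL);
	if (ib_dma_mapping_error(dev, addr))
		return -EIO;

	*dma_addr = addr;
	return 0;
}

static void unmap_payload_after_rdma(struct ib_device *dev, u64 dma_addr,
				     unsigned int len)
{
	/* Per-I/O unmap: IOTLB invalidation when the IOMMU is enabled. */
	ib_dma_unmap_page(dev, dma_addr, len, DMA_BIDIRECTIONAL);
}

/*
 * ib_send_bw, by contrast, typically registers one large buffer a single
 * time before the timed loop (userspace verbs, shown only as a comment
 * to keep this a single kernel-side sketch):
 *
 *	mr = ibv_reg_mr(pd, buf, size, IBV_ACCESS_LOCAL_WRITE);
 *
 * so any IOMMU cost is paid once at setup and never appears per I/O.
 */

If the per-I/O mapping is where the IOMMU overhead comes from, a perf
profile of the fio run with intel_iommu=on versus intel_iommu=off should
show time in the Intel IOMMU map/unmap paths that goes away when it is
disabled.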
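
As for the fio jobs mentioned above, a job file along these lines should
roughly reproduce the randrw 4K and randread 256K tests. This is a guess
at an equivalent job, not the exact one that was run; the mount point,
file size, queue depth and runtime are assumptions.

; Approximation of the tests described in this thread -- not the exact
; job file. Mount point, size, iodepth and runtime are assumptions.
[global]
; NFS/RDMA mount, backed by tmpfs (noatime) on the server
directory=/mnt/nfsrdma
ioengine=libaio
direct=1
iodepth=32
; two fio processes, as described earlier in the thread
numjobs=2
size=4g
runtime=60
time_based
group_reporting

[randrw-4k]
rw=randrw
bs=4k

[randread-256k]
; run after the randrw job completes
stonewall
rw=randread
bs=256k

Running the same jobs against the IPoIB-CM mount, and with the Intel
IOMMU enabled and disabled, should make the comparisons above easy to
repeat.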