Re: NFS over RDMA issues on Linux 5.4

On Mon, Aug 03, 2020 at 12:24:21PM -0400, Chuck Lever wrote:
> Hi Timo-
>
> > On Aug 3, 2020, at 11:05 AM, Timo Rothenpieler <timo@xxxxxxxxxxxxxxxx> wrote:
> >
> > Hello,
> >
> > I have just deployed a new system with Mellanox ConnectX-4 VPI EDR IB cards and wanted to setup NFS over RDMA on it.
> >
> > However, while mounting the FS over RDMA works fine, actually using it results in the following messages absolutely hammering dmesg on both client and server:
> >
> >> https://gist.github.com/BtbN/9582e597b6581f552fa15982b0285b80#file-server-log
> >
> > The spam only stops once I forcibly reboot the client. The filesystem gets nowhere during all this. The retrans counter in nfsstat just keeps going up, nothing actually gets done.
> >
> > This is on Linux 5.4.54, using nfs-utils 2.4.3.
> > The mlx5 driver had enhanced-mode disabled in order to enable IPoIB connected mode with an MTU of 65520.
> >
> > Normal NFS 4.2 over tcp works perfectly fine on this setup, it's only when I mount via rdma that things go wrong.
> >
> > Is this an issue on my end, or did I run into a bug somewhere here?
> > Any pointers, patches and solutions to test are welcome.
>
> I haven't seen that failure mode here, so best I can recommend is
> keep investigating. I've copied linux-rdma in case they have any
> advice.

The mention of IPoIB is slightly confusing in the context of NFS-over-RDMA.
Are you running NFS over IPoIB?
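
One way to answer that is to look at the transport recorded for each NFS mount on the client. The following is only a rough sketch, assuming the client exposes mount options in the usual /proc/mounts format and that the transport shows up as a proto= option (proto=rdma for NFS/RDMA, proto=tcp for a mount over TCP, including TCP over IPoIB). The function name nfs_transports is made up for illustration.

```python
def nfs_transports(mounts_text):
    """Map each NFS mount point to its proto= option.

    mounts_text is text in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS client mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        proto = "unknown"  # assumption: reported when no proto= option is listed
        for opt in fields[3].split(","):
            if opt.startswith("proto="):
                proto = opt[len("proto="):]
        result[fields[1]] = proto
    return result
```

Feeding it the contents of /proc/mounts on the client should show "rdma" for the affected mount if it really uses the RDMA transport, and "tcp" if the traffic is going over IPoIB instead.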

From a brief look at the CQE error syndrome (local length error), the client is sending an incorrect WQE.

Thanks

>
> --
> Chuck Lever
>
>
>


