Re: reservation errors during fstests on pNFS block

Christoph Hellwig <hch@xxxxxxxxxxxxx> · Fri, 14 Jun 2024 11:26:17 -0700

On Fri, Jun 14, 2024 at 05:46:21PM +0000, Chuck Lever III wrote:
> > Reservation means another node has an active reservation on that LU.
> 
> There are only two accessors of the LUN: the NFS server and
> the NFS client running the test. That's why these errors are
> a little surprising to me.

You can create registrations from userspace, and some cluster managers
do that.  But none of that should happen for a default setup.

> > When pNFS layout access fails we fall back to normal access through the
> > MDS, so this is expected.
> 
> Expected, OK. From a usability standpoint, error messages like
> this would probably be alarming to administrators. I plan to
> convert the printk's and dprintk's in the NFSD layout code into
> trace points, but that doesn't help the messages emitted by the
> block and SCSI drivers. Ideally this should be less noisy.

Well, they really should be alarming because the admin configured
a block layout setup and it did not work as expected.  So it should
ring alarm bells.

> > Is generic/069 the first test that failed when doing a full xfstests
> > run?
> 
> Yes, it's a full run. generic/069 is the first test where there
> are remarkable system journal messages (ie, PR errors), though
> there are a few subsequent tests that are also whinging.

Interesting.  Normally only the server actually reserves the LU,
the clients just register.  And something went wrong here and only
for these tests.

> > Do you see LAYOUT* ops in /proc/self/mountstats for the previous
> > tests?
> 
> generic/013 is known to generate layout recalls, for example,
> so there is layout activity during the test run.

Ok.  The other thing would be to run blktrace on the client and
check that it shows I/O.  All this sounds like the tests work in
general, but something is up with generic/069.

generic/069 just does O_APPEND writes, so I can't see what
would be so special about it.
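
For the mountstats check mentioned above, a minimal sketch of pulling the
LAYOUT* counters out of /proc/self/mountstats.  It assumes the usual
"per-op statistics" layout, where each line is an op name followed by a
colon and the first number is the operation count.  The function name
layout_op_counts is just an illustration, not an existing tool.

```python
import re
import sys


def layout_op_counts(mountstats_text):
    """Return {mount_point: {op_name: op_count}} for all LAYOUT* ops.

    Assumes per-op lines look like "LAYOUTGET: 5 5 0 ..." and that the
    first number after the colon is the number of operations issued.
    """
    counts = {}
    mount = None
    for line in mountstats_text.splitlines():
        # Each mount starts with: device <dev> mounted on <path> with fstype <type>
        m = re.match(r"device \S+ mounted on (\S+) with fstype \S+", line)
        if m:
            mount = m.group(1)
            continue
        # Per-op lines are indented; only keep ops whose name starts with LAYOUT.
        m = re.match(r"\s*(LAYOUT\w*):\s+(\d+)", line)
        if m and mount is not None:
            counts.setdefault(mount, {})[m.group(1)] = int(m.group(2))
    return counts


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "/proc/self/mountstats"
    with open(path) as f:
        for mnt, ops in layout_op_counts(f.read()).items():
            for op, n in sorted(ops.items()):
                print(f"{mnt}: {op} = {n}")
```

Running it before and after a single test and comparing the counts should
show whether that test did any layout operations at all, or went through
the MDS the whole time.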

> 
> I can go back and try reproducing with just generic/069 and
> tcpdump as a first step. Is there a way I can tell that the
> PR errors are not reporting a possible data corruption?

xfstests in general does data verification to check for data integrity,
so we should not rely on kernel messages.

I'm a bit busy right now, but I'll try to reproduce this locally next
week.