On 17 Nov 2023, at 6:21, Benjamin Coddington wrote:

> On 16 Nov 2023, at 17:11, Anna Schumaker wrote:
>
>> On Thu, Nov 16, 2023 at 5:03 PM Benjamin Coddington <bcodding@xxxxxxxxxx> wrote:
>>>
>>> On 16 Nov 2023, at 16:44, Anna Schumaker wrote:
>>>
>>>> Hi Ben,
>>>>
>>>> On Wed, Nov 15, 2023 at 4:34 PM Benjamin Coddington <bcodding@xxxxxxxxxx> wrote:
>>>>>
>>>>> Now that we're calculating how large a remaining IO should be based
>>>>> on the current request's offset, we no longer need to track bytes_left on
>>>>> each struct nfs_direct_req. Drop the field, and clean up the direct
>>>>> request tracepoints.
>>>>
>>>> I've been having problems with xfstests generic/465 on all NFS
>>>> versions after applying this patch. Looking at wireshark, the client
>>>> appears to be resending the same reads over and over again. Have you
>>>> seen anything like this in your testing?
>>>
>>> I have generic/465 failing before and after these two patches on pNFS SCSI..
>>> but at least it completes. If I run it without pNFS I can see the same
>>> thing.. it just sends the same reads over and over. I'll figure out why.
>>
>> Thanks! I have it failing normally as well, so that's expected. It's
>> the hanging forever that's not :)
>
> The direct read is returning 0 when there's data on the device.
>
> Oh, the problem is probably that patch drops the update of dreq->max_count,
> which I overlooked because of the double assignment. Shame on me.

BTW - I think generic/465 makes bad assumptions about what read() should
return for O_DIRECT, and that's why it fails on NFS. Basically it does a
bunch of WRITEs and then READs and expects the same data coming back in the
READs, but doesn't use O_SYNC. On the wire, the client is interleaving the
READs and WRITEs.

Ben
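
For anyone following along, here is a rough userspace sketch of how dropping
one half of a chained assignment could make a read report 0 bytes. It is only
a toy model: struct toy_dreq and the toy_* helpers are hypothetical names, and
the assumption that completion clamps the byte count to max_count is mine, not
something taken from the patch or from fs/nfs/direct.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for a direct I/O request.
 * This is NOT the real struct nfs_direct_req. */
struct toy_dreq {
    size_t count;      /* bytes reported back to the caller */
    size_t max_count;  /* upper bound on count (assumed semantics) */
};

/* Assumed shape of the setup before the cleanup:
 *     dreq->bytes_left = dreq->max_count = count;
 * Deleting that whole line to get rid of bytes_left also deletes
 * the max_count update, which this helper models. */
static void toy_setup_buggy(struct toy_dreq *d, size_t requested)
{
    (void)requested;   /* the chained assignment was removed wholesale */
    d->count = 0;
    d->max_count = 0;  /* stays at its zero-initialised value */
}

/* Same cleanup, but keeping the half of the assignment still in use. */
static void toy_setup_fixed(struct toy_dreq *d, size_t requested)
{
    d->count = 0;
    d->max_count = requested;
}

/* Toy completion path: clamp the end offset of the finished I/O to
 * max_count, then advance count to it. With max_count == 0 the result
 * can never rise above 0, however much data came back. */
static size_t toy_complete(struct toy_dreq *d, size_t end_of_io)
{
    size_t len = end_of_io;

    if (len > d->max_count)
        len = d->max_count;
    if (d->count < len)
        d->count = len;
    return d->count;
}
```

In this model, toy_setup_buggy() followed by toy_complete(&d, 4096) yields 0,
while toy_setup_fixed() with the same request yields 4096. A caller that
retries on a 0-byte result would then keep issuing the same read, which would
be consistent with the repeated READs seen on the wire.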