RE: [NFS-Ganesha-Devel] Re: lseek gets bad offset from nfs client with ganesha/gluster which supports SEEK

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Has something changed in the RFC, the knfsd code, or the client with respect to seeking a hole at the end of file, particularly a legacy file that has no actual holes?

Thanks

Frank

> -----Original Message-----
> From: Kinglong Mee [mailto:kinglongmee@xxxxxxxxx]
> Sent: Wednesday, September 12, 2018 5:04 PM
> To: Frank Filz <ffilzlnx@xxxxxxxxxxxxxx>; 'Anna Schumaker'
> <schumaker.anna@xxxxxxxxx>; 'Trond Myklebust'
> <trondmy@xxxxxxxxxxxxxxx>; devel@xxxxxxxxxxxxxxxxxxxxx; linux-
> nfs@xxxxxxxxxxxxxxx
> Subject: [NFS-Ganesha-Devel] Re: lseek gets bad offset from nfs client with
> ganesha/gluster which supports SEEK
> 
> On 2018/9/12 19:58, Frank Filz wrote:
> >> On 2018/9/12 7:20, Frank Filz wrote:
> >>>> On Tue, 2018-09-11 at 22:47 +0800, Kinglong Mee wrote:
> >>>>> On 2018/9/11 20:57, Trond Myklebust wrote:
> >>>>>> On Tue, 2018-09-11 at 20:29 +0800, Kinglong Mee wrote:
> >>>>>>> The latest ganesha/gluster supports seek according to,
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> https://tools.ietf.org/html/draft-ietf-nfsv4-minorversion2-41#sec
> >>>>>> ti
> >>>>>> o
> >>>>>> n-15.11
> >>>>>>>
> >>>>>>>    From the given sa_offset, find the next data_content4 of type
> >>>>>>> sa_what
> >>>>>>>    in the file.  If the server can not find a corresponding sa_what,
> >>>>>>>    then the status will still be NFS4_OK, but sr_eof would be
> >>>>>>> TRUE.  If
> >>>>>>>    the server can find the sa_what, then the sr_offset is the
> >>>>>>> start of
> >>>>>>>    that content.  If the sa_offset is beyond the end of the
> >>>>>>> file, then
> >>>>>>>    SEEK MUST return NFS4ERR_NXIO.
> >>>>>>>
> >>>>>>> For a file's filemap as,
> >>>>>>>
> >>>>>>> Part    1: HOLE 0x0000000000000000 ---> 0x0000000000600000
> >>>>>>> Part    2: DATA 0x0000000000600000 ---> 0x0000000000700000
> >>>>>>> Part    3: HOLE 0x0000000000700000 ---> 0x0000000001000000>>
> >>>>>>> SEEK(0x700000, SEEK_DATA) gets result (sr_eof:1,
> >>>>>>> sr_offset:0x70000) from ganesha/gluster; SEEK(0x700000,
> >>>>>>> SEEK_HOLE) gets result (sr_eof:0, sr_offset:0x70000) from
> ganesha/gluster.
> >>>>>>>
> >>>>>>> If an application depends the lseek result for data searching,
> >>>>>>> it may enter infinite loop.
> >>>>>>>
> >>>>>>>         while (1) {
> >>>>>>>                 next_pos = lseek(fd, cur_pos, seek_type);
> >>>>>>>                 if (seek_type == SEEK_DATA) {
> >>>>>>>                         seek_type = SEEK_HOLE;
> >>>>>>>                 } else {
> >>>>>>>                         seek_type = SEEK_DATA;
> >>>>>>>                 }
> >>>>>>>
> >>>>>>>                 if (next_pos == -1) {
> >>>>>>>                         return ;
> >>>>>>>
> >>>>>>>                 cur_pos = next_pos;
> >>>>>>>         }
> >>>>>>>
> >>>>>>> The lseek syscall always gets 0x70000 from nfs client for those
> >>>>>>> two cases, but, if underlying filesystem is ext4/f2fs, or the
> >>>>>>> nfs server is knfsd, the lseek(0x700000, SEEK_DATA) gets ENXIO.
> >>>>>>>
> >>>>>>> I wanna to know,
> >>>>>>> should I fix the ganesha/gluster as knfsd return ENXIO for the
> >>>>>>> first case?
> >>>>>>> or should I fix the nfs client to return ENXIO for the first case?
> >>>>>>>
> >>>>>>
> >>>>>> It that test correct? The fallback implementation of SEEK_DATA
> >>>>>> assumes that the entire file is data, so lseek(SEEK_DATA) on any
> >>>>>> offset that is <= eof will be a no-op. The fallback
> >>>>>> implementation of SEEK_HOLE assumes that the first hole is at eof.
> >>>>>
> >>>>> I think that's non-nfsv4.2's logical.
> >>>>>
> >>>>>>
> >>>>>> IOW: unless the initial value for cur_pos is > eof, it looks to
> >>>>>> me as if the above test will loop infinitely given any filesystem
> >>>>>> that doesn't implement native support for SEEK_DATA/SEEK_HOLE.
> >>>>>>
> >>>>>
> >>>>> No, if underlying filesystem is ext4/f2fs, or the nfs server is
> >>>>> knfsd, the last lseek syscall always return ENXIO no matter the
> >>>>> cur_pos is > eof or not.
> >>>>>
> >>>>> A file ends with a hole as,
> >>>>> Part   22: DATA 0x0000000006a00000 ---> 0x0000000006afffff
> >>>>> Part   23: HOLE 0x0000000006b00000 ---> 0x000000000c7fffff
> >>>>>
> >>>>> # stat testfile
> >>>>>   File: testfile
> >>>>>   Size: 209715200       Blocks: 22640      IO Block: 4096   regular file
> >>>>> Device: 807h/2055d      Inode: 843122      Links: 2
> >>>>> Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
> >>>>> Access: 2018-09-11 20:01:41.881227061 +0800
> >>>>> Modify: 2018-09-11 20:01:41.976478311 +0800
> >>>>> Change: 2018-09-11 20:01:41.976478311 +0800
> >>>>>  Birth: -
> >>>>>
> >>>>> # strace filemap testfile
> >>>>> ... ...
> >>>>> lseek(3, 111149056, SEEK_HOLE)          = 112197632
> >>>>> lseek(3, 112197632, SEEK_DATA)          = -1 ENXIO (No such device or
> >> address)
> >>>>>
> >>>>> Right now, when knfsd gets the ENXIO error, it returns the error
> >>>>> to client directly, and return to syscall.
> >>>>> But, ganesha set the sr_eof to true and return NFS4_OK to client
> >>>>> as RFC description, nfs client skips the sr_eof and return a bad
> >>>>> offset to syscall.
> >>>>
> >>>> Would it make more sense to change Knfsd instead of the client?  I
> >>>> think I was trying to keep things simple when I wrote the code, so
> >>>> I just passed the result of the lseek system call back to the client.
> >>>
> >>> Looking at the lseek(2) man page, it's not clear to me what should
> >>> be returned
> >> if as in this example, there is a HOLE at the end of the file (i.e.
> >> filesize is larger than the range of the last DATA in the file). It
> >> sounds like ext4 returns ENXIO if a SEEK_DATA is done past the last data in
> the file.
> >>>
> >>>        SEEK_DATA
> >>>               Adjust the file offset to the next location in the
> >>> file greater than or
> >> equal to offset containing data.  If offset points to data, then the
> >> file offset  is set
> >>>               to offset.
> >>>
> >>>        SEEK_HOLE
> >>>               Adjust  the file offset to the next hole in the file
> >>> greater than or equal
> >> to offset.  If offset points into the middle of a hole, then the file
> >> offset is set to
> >>>               offset.  If there is no hole past offset, then the
> >>> file offset is adjusted to
> >> the end of the file (i.e., there is an implicit hole at the end of any file).
> >>>
> >>>        In both of the above cases, lseek() fails if offset points
> >>> past the end of the
> >> file.
> >>>
> >>>        These operations allow applications to map holes in a
> >>> sparsely allocated
> >> file.  This can be useful for applications such as file backup tools,
> >> which can save space when
> >>>        creating backups and preserve holes, if they have a mechanism
> >>> for
> >> discovering holes.
> >>>
> >>>        For the purposes of these operations, a hole is a sequence of
> >>> zeros that
> >> (normally) has not been allocated in the underlying file storage.
> >> However, a filesystem is not
> >>>        obliged to report holes, so these operations are not a
> >>> guaranteed
> >> mechanism for mapping the storage space actually allocated to a file.
> >> (Furthermore,  a  sequence  of
> >>>        zeros  that actually has been written to the underlying
> >>> storage may not be
> >> reported as a hole.)  In the simplest implementation, a filesystem
> >> can support the operations
> >>>        by making SEEK_HOLE always return the offset of the end of
> >>> the file, and
> >> making SEEK_DATA always return offset (i.e., even if the location
> >> referred to by offset  is  a
> >>>        hole, it can be considered to consist of data that is a sequence of zeros).
> >>>
> >>> The RFC text is pretty clear:
> >>>
> >>>    SEEK is an operation that allows a client to determine the location
> >>>    of the next data_content4 in a file.  It allows an implementation of
> >>>    the emerging extension to lseek(2) to allow clients to determine the
> >>>    next hole whilst in data or the next data whilst in a hole.
> >>>
> >>>    From the given sa_offset, find the next data_content4 of type sa_what
> >>>    in the file.  If the server can not find a corresponding sa_what,
> >>>    then the status will still be NFS4_OK, but sr_eof would be TRUE.  If
> >>>    the server can find the sa_what, then the sr_offset is the start of
> >>>    that content.  If the sa_offset is beyond the end of the file, then
> >>>    SEEK MUST return NFS4ERR_NXIO.
> >>>
> >>>    All files MUST have a virtual hole at the end of the file.  I.e., if
> >>>    a filesystem does not support sparse files, then a compound with
> >>>    {SEEK 0 NFS4_CONTENT_HOLE;} would return a result of {SEEK 1 X;}
> >>>    where 'X' was the size of the file.
> >>>
> >>> Sa_offset is not past the end of the file, but there is no more
> >>> DATA, so a seek
> >> DATA from 0x70000 (original file) should return sr_eof TRUE.
> >>>
> >>> In either RFC or lseek(2), a seek HOLE for 0x70000 will return 0x70000.
> >>>
> >>> It certainly makes sense that you should be able to have a hole at
> >>> the end of a
> >> file (pre-allocated disk blocks but no data written yet), and is in
> >> fact what
> >> fallocate(2) will do.
> >>>
> >>> An NFS server could check the filesize and if sa_offset is <
> >>> filesize and a
> >> SEEK_DATA returns ENXIO, it could translate that to NFS4_OK and set
> >> sr_eof to TRUE.
> >>>
> >>> The Ganesha code in FSAL_GLUSTER I believe is wrong. It changes any
> >>> ENXIO
> >> result to NFS4_OK with sr_eof TRUE. It would be better for it to do
> >> the simple thing knfsd does of always passing along the ENXIO (this
> >> may be best if it is not possible to safely verify sa_offset really is < filesize).
> >>
> >> Do you mean modifying ganesha/gluster as knfsd does?
> >>
> >>         seek->seek_pos = vfs_llseek(file, seek->seek_offset, whence);
> >>         if (seek->seek_pos < 0)
> >>                 status = nfserrno(seek->seek_pos);
> >>         else if (seek->seek_pos >= i_size_read(file_inode(file)))
> >>                 seek->seek_eof = true;
> >>
> >> It is a working implementation, but not according to RFC description,
> >>
> >>    If the server can not find a corresponding sa_what,
> >>    then the status will still be NFS4_OK, but sr_eof would be TRUE.
> >>
> >> As in this example, there is a HOLE at the end of the file, SEEK(in
> >> hole,
> >> SEEK_DATA) should return NFS4_OK and sr_eof is TRUE, but knfsd return
> >> NFS4ERR_NXIO.
> >
> > FSAL_GLUSTER always translates lseek return of ENXIO to NFS4_Ok with
> sr_eod TRUE.
> >
> > It should at least ONLY do that if sa_offset is < filesize (which would then be
> correct per RFC).
> >
> > Knfsd, to my understanding, looks like it always just returns ENXIO (which isn't
> exactly per RFC, but at least doesn't confuse the client and application as badly).
> >
> 
> Copy that.
> I will push a fix as knfsd.
> 
> thanks,
> Kinglong Mee
> _______________________________________________
> Devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxx To unsubscribe send an email to
> devel-leave@xxxxxxxxxxxxxxxxxxxxx





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux