On Fri, Apr 12, 2013 at 03:58:04PM -0400, Michael Brown wrote:
> KERBOOM
>
> [michael@fleming1 ~]$ sudo mount -a -t nfs
> [sudo] password for michael:
> mount: fearless1:/gv0 failed, reason given by server: No such file or
> directory
> mount: fearless1:/gv0/fleming1/db0/ALTUS_config failed, reason given by
> server: unknown nfs status return value: 22
> mount: fearless1:/gv0/fleming1/db0/ALTUS_data failed, reason given by
> server: unknown nfs status return value: 22
> mount: fearless1:/gv0/fleming1/db0/ALTUS_flash failed, reason given by
> server: unknown nfs status return value: 22
> mount.nfs: mount point /db/flash_recovery_area/ALTUS/onlinelog does not
> exist
>
> nfs.log:
> [2013-04-12 15:55:16.507084] E [nfs3.c:305:__nfs3_get_volume_id]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo+0x22c)
> [0x7f45bfbb852c]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo_reply+0x29)
> [0x7f45bfbb2ce9]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0x51)
> [0x7f45bfbb2481]))) 0-nfs-nfsv3: invalid argument: xl
> [2013-04-12 15:55:16.538560] E [nfs3.c:4706:nfs3_fsinfo] 0-nfs-nfsv3:
> Bad Handle
> [2013-04-12 15:55:16.538580] W [nfs3-helpers.c:3389:nfs3_log_common_res]
> 0-nfs-nfsv3: XID: 242c1550, FSINFO: NFS: 10001(Illegal NFS file handle),
> POSIX: 14(Bad address)
> [2013-04-12 15:55:16.538617] E [nfs3.c:305:__nfs3_get_volume_id]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo+0x22c)
> [0x7f45bfbb852c]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo_reply+0x29)
> [0x7f45bfbb2ce9]
> (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0x51)
> [0x7f45bfbb2481]))) 0-nfs-nfsv3: invalid argument: xl
>
> (I tried both with and without modifying your uint32_t size to a
> 'int32_t size' to correct the signedness of the argument)
>
> Get ahold of me in IRC and let's get this figured out. I've got a
> debugger attached.

23:51 < ndevos> Supermathie: ah, I've thought of the error in my suggestion - that function is used to encode and decode
23:52 < ndevos> which means, that the size parameter must be set correctly - the .data_len attribute contain the size when encoding, and should be overwritten when decoding
23:53 < ndevos> KERBOOM happens when an idea is only half looked at :-/

Maybe something like the attached patch works better? It should
encode/decode both the length and the fhandle value. Compile tested only.

Niels

>
> M.
>
> On 13-04-12 11:32 AM, Niels de Vos wrote:
> > On Fri, Apr 12, 2013 at 05:23:08PM +0200, Niels de Vos wrote:
> >> On Thu, Apr 11, 2013 at 12:37:30PM -0400, Michael Brown wrote:
> >>> That actually broke everything (including Linux trying to mount NFS).
> >>>
> >>> I've modified it slightly to be:
> >>>
> >>> bool_t
> >>> xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
> >>> {
> >>>         if (!xdr_bytes (xdrs, (char **)&objp->data.data_val, (u_int *)
> >>> &objp->data.data_len, NFS3_FHSIZE))
> >>>                 if (!xdr_opaque (xdrs, &objp, (u_int *)
> >>> &objp->data.data_len))
> >>>                         return FALSE;
> >>>         return TRUE;
> >>> }
> >>>
> >>> (i.e. only call the xdr_opaque function if the xdr_bytes decode fails)
> >> Nah, that won't work. The xdr_* functions are modifying the position of
> >> the cursor in the XDR-stream. Subsequent reads will continue where the
> >> previous one finished.
> >>
> >> What you probably need to do is something like this:
> >>
> >> xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
> >> {
> >>         uint32_t size;
> >>
> >>         if (!xdr_int (xdrs, &size))
> >>         if (!xdr_opaque (xdrs, (u_int *)&objp->data.data_len, size))
> >                                 ^ that should be objp->data.data_val of course :-/
> >
> >>                 return FALSE
> >>         return TRUE;
> >> }
> >>
> >> That will read the size of the fhandle first, to determine how long the opaque
> >> fhandle is, and use that size to read it.
> >>
> >> Cheers,
> >> Niels
> >>
> >>> But I get no change in behaviour.
> >>>
> >>> Also get these warnings:
> >>>
> >>> xdr-nfs3.c: In function 'xdr_nfs_fh3':
> >>> xdr-nfs3.c:197: warning: passing argument 2 of 'xdr_opaque' from
> >>> incompatible pointer type
> >>> /usr/include/rpc/xdr.h:313: note: expected 'caddr_t' but argument is of
> >>> type 'struct nfs_fh3 **'
> >>> xdr-nfs3.c:197: warning: passing argument 3 of 'xdr_opaque' makes
> >>> integer from pointer without a cast
> >>> /usr/include/rpc/xdr.h:313: note: expected 'u_int' but argument is of
> >>> type 'u_int *'
> >>>
> >>> M.
> >>>
> >>> On 13-04-11 07:42 AM, Niels de Vos wrote:
> >>>> My guess is that this (untested) change would fix it, can you try that?
> >>>>
> >>>> --- a/rpc/xdr/src/xdr-nfs3.c
> >>>> +++ b/rpc/xdr/src/xdr-nfs3.c
> >>>> @@ -184,7 +184,7 @@ xdr_specdata3 (XDR *xdrs, specdata3 *objp)
> >>>>  bool_t
> >>>>  xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
> >>>>  {
> >>>> -        if (!xdr_bytes (xdrs, (char **)&objp->data.data_val, (u_int *) &objp->data.data_len, NFS3_FHSIZE))
> >>>> +        if (!xdr_opaque (xdrs, &objp, (u_int *) &objp->data.data_len))
> >>>>                  return FALSE;
> >>>>          return TRUE;
> >>>>  }
> >>>>
> >>>>
> >>>> HTH,
> >>>> Niels
> >>>>
> >>>>> All I get out of gluster is:
> >>>>> [2013-04-08 12:54:32.206312] E [nfs3.c:4741:nfs3svc_fsinfo] 0-nfs-nfsv3:
> >>>>> Error decoding arguments
> >>>>>
> >>>>>
> >>>>> I've attached abridged packet captures and text explanations of the
> >>>>> packets (thanks to wireshark).
> >>>>>
> >>>>> Can someone please look at this and determine if it's gluster's parsing
> >>>>> of the RPC call to blame, or if it's Oracle?
> >>>>>
> >>>>> This is the same setup on which I reported the NFS race condition bug.
> >>>>> It does have that patch applied.
> >>>>> Details:
> >>>>> http://lists.gnu.org/archive/html/gluster-devel/2013-04/msg00014.html
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Michael
> >>>>>
> >>>>> --
> >>>>> Michael Brown               | `One of the main causes of the fall of
> >>>>> Systems Consultant          | the Roman Empire was that, lacking zero,
> >>>>> Net Direct Inc.             | they had no way to indicate successful
> >>>>> ☎: +1 519 883 1172 x5106    | termination of their C programs.' - Firth
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>>> _______________________________________________
> >>>>> Gluster-devel mailing list
> >>>>> Gluster-devel@xxxxxxxxxx
> >>>>> https://lists.nongnu.org/mailman/listinfo/gluster-devel
> >>>
> >>> --
> >>> Michael Brown               | `One of the main causes of the fall of
> >>> Systems Consultant          | the Roman Empire was that, lacking zero,
> >>> Net Direct Inc.             | they had no way to indicate successful
> >>> ☎: +1 519 883 1172 x5106    | termination of their C programs.' - Firth
> >>>
> >> --
> >> Niels de Vos
> >> Sr. Software Maintenance Engineer
> >> Support Engineering Group
> >> Red Hat Global Support Services
> >>
> >> _______________________________________________
> >> Gluster-devel mailing list
> >> Gluster-devel@xxxxxxxxxx
> >> https://lists.nongnu.org/mailman/listinfo/gluster-devel
>
>
> --
> Michael Brown               | `One of the main causes of the fall of
> Systems Consultant          | the Roman Empire was that, lacking zero,
> Net Direct Inc.             | they had no way to indicate successful
> ☎: +1 519 883 1172 x5106    | termination of their C programs.' - Firth
>

--
Niels de Vos
Sr. Software Maintenance Engineer
Support Engineering Group
Red Hat Global Support Services

From 2f7f6b952ed89f5cf8181db351e1965d8400f493 Mon Sep 17 00:00:00 2001
From: Niels de Vos <ndevos@xxxxxxxxxx>
Date: Sat, 13 Apr 2013 00:41:43 +0200
Subject: [PATCH] nfs: encode/decode fhandles as opaque and not as bytes

At least one client (Oracle DNFS) does not pass an XDR roundup'd byte
array as a fhandle on FSINFO.

XDR (http://tools.ietf.org/html/rfc4506, the encoding used for the RPC
protocol) uses 'blocks' for alignment. A fhandle byte array that is
34 bytes long needs to be (34 / 4 + 1)*4 = 36 bytes in size. The
'length' given in the structure tells the consumer to ignore the two
trailing bytes.

The NFSv3 specification (http://tools.ietf.org/html/rfc1813#page-21)
defines the nfs_fh3 as an opaque (not bytes) structure.

BUG: 950121
Change-Id: Id723a38ef0ec6e7f1d9f29683321ea32e00503c7
Reported-by: Michael Brown <michael@xxxxxxxxxxxxxxx>
Signed-off-by: Niels de Vos <ndevos@xxxxxxxxxx>
---
 rpc/xdr/src/xdr-nfs3.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/rpc/xdr/src/xdr-nfs3.c b/rpc/xdr/src/xdr-nfs3.c
index a497e9f..39dbf5c 100644
--- a/rpc/xdr/src/xdr-nfs3.c
+++ b/rpc/xdr/src/xdr-nfs3.c
@@ -184,7 +184,9 @@ xdr_specdata3 (XDR *xdrs, specdata3 *objp)
 bool_t
 xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
 {
-	if (!xdr_bytes (xdrs, (char **)&objp->data.data_val, (u_int *) &objp->data.data_len, NFS3_FHSIZE))
+	if (!xdr_uint32 (xdrs, (u_int *) &objp->data.data_len))
+		return FALSE;
+	if (!xdr_opaque (xdrs, (char *) &objp->data.data_val, (u_int) objp->data.data_len))
 		return FALSE;
 	return TRUE;
 }
-- 
1.7.1
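
For anyone following the padding arithmetic in the commit message, here is a
small standalone sketch of the wire layout being discussed: a 4-byte
big-endian length word, the fhandle bytes, then zero padding up to the next
multiple of four, as RFC 4506 describes for variable-length opaque data. It
does not use the rpc/xdr library and is not the gluster code. The helper
names (xdr_roundup4, fh_encode, fh_decode) are made up for illustration;
NFS3_FHSIZE is 64 as in RFC 1813.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NFS3_FHSIZE 64u /* maximum NFSv3 file handle size, per RFC 1813 */

/* Hypothetical helper: round len up to the next multiple of the
 * 4-byte XDR unit. A length that is already a multiple of 4 is unchanged. */
uint32_t xdr_roundup4(uint32_t len)
{
    return (len + 3u) & ~3u;
}

/* Hypothetical helper: write the length word, the handle bytes and the
 * zero padding into buf. Returns bytes written, or 0 on error. */
size_t fh_encode(uint8_t *buf, size_t cap, const uint8_t *fh, uint32_t len)
{
    uint32_t padded = xdr_roundup4(len);

    if (len > NFS3_FHSIZE || cap < 4u + padded)
        return 0;
    buf[0] = (uint8_t)(len >> 24);
    buf[1] = (uint8_t)(len >> 16);
    buf[2] = (uint8_t)(len >> 8);
    buf[3] = (uint8_t)len;
    memcpy(buf + 4, fh, len);
    memset(buf + 4 + len, 0, padded - len); /* 0 to 3 padding bytes */
    return 4u + padded;
}

/* Hypothetical strict decoder: read the length word first, then the
 * handle, and insist that the padding bytes are present in the input.
 * fh must have room for NFS3_FHSIZE bytes.
 * Returns bytes consumed, or 0 on error. */
size_t fh_decode(const uint8_t *buf, size_t avail, uint8_t *fh, uint32_t *len)
{
    uint32_t n;

    assert(buf != NULL && fh != NULL && len != NULL);
    if (avail < 4)
        return 0;
    n = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
        ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    if (n > NFS3_FHSIZE || avail - 4 < xdr_roundup4(n))
        return 0;
    memcpy(fh, buf + 4, n);
    *len = n;
    return 4u + (size_t)xdr_roundup4(n);
}
```

With a 34-byte handle, fh_encode() writes 40 bytes: 4 for the length, 34 of
data and 2 of padding. Handing fh_decode() only the first 38 of those bytes
(the padding left off) makes this strict decoder fail, which is the kind of
input the commit message attributes to the DNFS client. Note that the
(34 / 4 + 1)*4 formula happens to give the right answer for 34, but would
give 36 for a 32-byte handle, which needs no padding at all; (len + 3) & ~3
covers both cases.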