On Wed, Aug 23, 2017 at 02:36:33PM -0400, Chuck Lever wrote: > > > On Aug 22, 2017, at 5:45 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote: > > > > On Mon, Aug 21, 2017 at 06:08:15PM -0400, Chuck Lever wrote: > >> > >>> On Aug 21, 2017, at 5:21 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote: > >>> > >>> On Mon, Aug 21, 2017 at 05:15:38PM -0400, Chuck Lever wrote: > >>>> > >>>>> On Aug 21, 2017, at 5:13 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote: > >>>>> > >>>>> On Fri, Aug 18, 2017 at 11:12:19AM -0400, Chuck Lever wrote: > >>>>>> When processing an NFSv4 WRITE operation, argp->end should never > >>>>>> point past the end of the data in the final page of the page list. > >>>>>> Otherwise, nfsd4_decode_compound can walk into uninitialized memory. > >>>>>> > >>>>>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > >>>>>> --- > >>>>>> fs/nfsd/nfs4xdr.c | 6 ++---- > >>>>>> 1 file changed, 2 insertions(+), 4 deletions(-) > >>>>>> > >>>>>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > >>>>>> index 51e729a..7c48d68 100644 > >>>>>> --- a/fs/nfsd/nfs4xdr.c > >>>>>> +++ b/fs/nfsd/nfs4xdr.c > >>>>>> @@ -144,7 +144,7 @@ static void next_decode_page(struct nfsd4_compoundargs *argp) > >>>>>> argp->p = page_address(argp->pagelist[0]); > >>>>>> argp->pagelist++; > >>>>>> if (argp->pagelen < PAGE_SIZE) { > >>>>>> - argp->end = argp->p + (argp->pagelen>>2); > >>>>>> + argp->end = argp->p + XDR_QUADLEN(argp->pagelen); > >> > >> ^^^^^^^^^^^^^^ A > >> > >>>>>> argp->pagelen = 0; > >>>>>> } else { > >>>>>> argp->end = argp->p + (PAGE_SIZE>>2); > >>>>>> @@ -1279,9 +1279,7 @@ static __be32 nfsd4_decode_opaque(struct nfsd4_compoundargs *argp, struct xdr_ne > >>>>>> argp->pagelen -= pages * PAGE_SIZE; > >>>>>> len -= pages * PAGE_SIZE; > >>>>>> > >>>>>> - argp->p = (__be32 *)page_address(argp->pagelist[0]); > >>>>>> - argp->pagelist++; > >>>>>> - argp->end = argp->p + XDR_QUADLEN(PAGE_SIZE); > >> > >> ^^^^^^^^^^^^^^ B > >> > >>>>>> + next_decode_page(argp); > >>>>> > >>>>> I think there's no change in behavior here *except* for adding a new > >>>>> argp->pagelen=0 (or argp->pagelen -= PAGE_SIZE). > >>>> > >>>> The code around this change is currently working correctly, > >>>> so there is no change in behavior AFAICT. This is a defensive > >>>> change, but it also replaces duplicate code. > >>> > >>> I don't understand. I'm saying that by calling next_decode_page() there > >>> you've added a new argp->pagelen assignment. I don't understand how > >>> that can't change behavior, unless there's another bug in our bounds > >>> checking someplace. > >> > >> Because of line B above, argp->end always points to the > >> end of the final page in the page list. However, the > >> buffer might end somewhere in the middle of that page, > >> in which case, the transport hasn't initialized any of > >> the bytes between the end of the buffer and the end of > >> the page. > >> > >> As long as the other fields in the xdr_buf are set up > >> properly, the XDR decoder will not walk into that uninit- > >> ialized section of the last page. But there's nothing > >> preventing a decoder or transport bug from causing it > >> to walk into the uninitialized area. And always setting > >> to the end of the page is confusing when the buffer > >> itself is actually shorter. > >> > >> The key is to replace line B above with line A. argp->end > >> is advanced by the remaining part of the final page rather > >> than by a whole page. > > > > Got it, I agree with that part of the change, it's the pagelen change I > > was having trouble with. > > > > But looking at it more, I think your patch is a fix and the current code > > is wrong. > > > >> The next patch uses this new behavior to signal precisely > >> when it has to move from the page list to the tail iovec. > >> > >> > >>> Most likely it could cause subsequent op parsers to believe there's less > >>> space in the argument buffer than there really is, so it might fail to > >>> parse a compound with a write plus some other ops, if that puts the > >>> total call close to the maximum size? > >> > >> Where is argp->pagelen used after the final next_decode_page > >> call? > > > > Well, it's checked in every read_buf and next_decode_page to decide how > > much space is left. > > Right, and I didn't change the pagelen adjustment that occurs > in the loop. Just the final adjustment should be different. > > > > It looks to me like the current code is wrong not to be decreasing > > page_len at the end there. So I wonder if there's a bug right now. > > E.g. maybe a compound with multiple writes could leave the xdr decoding > > thinking it has more space than it does and allow someone to write > > unrelated memory to some file. > > I believe that's possible. Yeah, I can totally crash the server this way. I'll send this along for stable. (And see if I can figure out how to get a test for this kind of thing into pynfs. Unfortunately it's seems to require some contortions to get it to produce bad xdr.) --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html