On Thu, 2009-03-05 at 11:49 -0800, Kevin W. Rudd wrote: > On Thu, 5 Mar 2009, Trond Myklebust wrote: > > > Going over that particular procedure reveals a number of bugs that all > > should be fixed. > > So true ;-) > > > Trying to estimate how many bytes are free in the header is _WRONG_. > > That memory is allocated with certain assumptions, which are set in the > > definition of ACL3_setaclargs_sz. In this case, the assumptions appear > > to be that we will fit a maximum of 5 ACEs in the header. > > IOW: the real fix for your corruption problem is simply to set > > len_in_head to 2*(2+5*3). Currently, the patch supplied by Sachin will > > always set it to zero. > > Does setting len_in_head to 2*(2+5*3) take into account the > header space already consumed? The initial patch was just a > quick-n-dirty way of making sure we didn't overwrite the initial > buffer, and a mechanism for generating some discussion around the > other deficiencies in this procedure. I'll take the blame for > it being a quick hack to simply take the heat off for a critical > customer situation :) > > > Secondly, doing page allocation in the XDR routine is _WRONG_. For one > > thing, look at the major memory leak which happens if the client needs > > to resend the request (which can happen every now and again due to a > > dropped tcp connection, or all the time due to UDP retransmits). > > Those pages should have been allocated in nfs3_proc_getacl(), and indeed > > that is where the wretched things are freed. > > That's an issue I hadn't noticed, but this isn't really my area of > expertice. Thanks for taking a closer look at this area of code. Does the attached patch help? Trond
Attachment:
linux-2.6.28-003-fix_nfsv3_acls.dif
Description: application/dif