On Thu, 5 Mar 2009, Trond Myklebust wrote:
Going over that particular procedure reveals a number of bugs that all should be fixed.
So true ;-)
Trying to estimate how many bytes are free in the header is _WRONG_. That memory is allocated with certain assumptions, which are set in the definition of ACL3_setaclargs_sz. In this case, the assumptions appear to be that we will fit a maximum of 5 ACEs in the header. IOW: the real fix for your corruption problem is simply to set len_in_head to 2*(2+5*3). Currently, the patch supplied by Sachin will always set it to zero.
Does setting len_in_head to 2*(2+5*3) take into account the header space already consumed? The initial patch was just a quick-n-dirty way of making sure we didn't overwrite the initial buffer, and a mechanism for generating some discussion around the other deficiencies in this procedure. I'll take the blame for it being a quick hack to simply take the heat off for a critical customer situation :)
Secondly, doing page allocation in the XDR routine is _WRONG_. For one thing, look at the major memory leak which happens if the client needs to resend the request (which can happen every now and again due to a dropped tcp connection, or all the time due to UDP retransmits). Those pages should have been allocated in nfs3_proc_getacl(), and indeed that is where the wretched things are freed.
That's an issue I hadn't noticed, but this isn't really my area of expertice. Thanks for taking a closer look at this area of code. -Kevin -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html