On Fri, Nov 16, 2018 at 01:52:29PM -0500, Olga Kornievskaia wrote:
> On Fri, Nov 16, 2018 at 1:30 PM Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> > Then how does the copy knows not to go wait for the callback? Copy
> > checks the pending_callback list to see if received a callback. If
> > not, it puts itself on the copy list and goes to sleep. The callback,
> > checks the copy list and if it finds a copy signals it, if not it puts
> > itself on the pending_callback list. a lock is held over checking one
> > list and putting yourself on the other.

OK, apologies, I don't really understand those data structures yet, but
something seems wrong to me. Under what circumstances could we receive
a CB_OFFLOAD without having started the corresponding copy already?

And shouldn't CB_OFFLOAD be returning bad_stateid in the case it doesn't
recognize the given stateid? It looks like the allocation failure is
the *only* way we'll return an error on CB_OFFLOAD, and that seems
wrong.

> > > I also wonder if SERVERFAULT is really the best error for a memory
> > > allocation failure there.
> >
> > I guess EIO or ENOMEM might be better. But I don't think this error
> > gets returned anywhere to the main process.
> >
>
> Wait. It is returning SERVERFAULT because it's the callback server replying
> back to the server's CB_RECALL call and I believe SERVERFAULT is the
> appropriate error here. NFS doesn't have ENOMEM error.

We could return DELAY if we think it might be worth the server trying
the CB_RECALL again. (That's what nfsd usually returns on allocation
failures. I don't know if that's really ideal.)

--b.
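
For reference, the handoff described in the quoted text (each side checks
the other side's list and, if it finds nothing, parks itself on its own
list, all under one lock) can be sketched in userspace C roughly as
follows. This is only an illustration using pthreads. Every name here
(rendezvous, copy_begin, cb_arrive, copy_wait) is made up, integer ids
stand in for stateids, and none of it is the actual NFS client code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical types; ids stand in for copy stateids. */
struct waiter {                 /* a copy waiting for its callback */
    int id, done, result;
    struct waiter *next;
};

struct pending_cb {             /* a callback that arrived before its copy */
    int id, result;
    struct pending_cb *next;
};

struct rendezvous {
    pthread_mutex_t lock;       /* one lock covers both lists */
    pthread_cond_t wake;
    struct waiter *copies;
    struct pending_cb *pending;
};

void rendezvous_init(struct rendezvous *r)
{
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->wake, NULL);
    r->copies = NULL;
    r->pending = NULL;
}

/* Copy side: returns 1 if the callback already arrived, else queues w and returns 0. */
int copy_begin(struct rendezvous *r, struct waiter *w, int id)
{
    struct pending_cb **pp;
    int found = 0;

    w->id = id; w->done = 0; w->result = 0; w->next = NULL;
    pthread_mutex_lock(&r->lock);
    for (pp = &r->pending; *pp; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            struct pending_cb *cb = *pp;
            *pp = cb->next;             /* unlink the early callback */
            w->result = cb->result;
            w->done = 1;
            free(cb);
            found = 1;
            break;
        }
    }
    if (!found) {                       /* nothing pending: park on the copy list */
        w->next = r->copies;
        r->copies = w;
    }
    pthread_mutex_unlock(&r->lock);
    return found;
}

/* Copy side: sleeps until the callback marks w done, then returns its result. */
int copy_wait(struct rendezvous *r, struct waiter *w)
{
    pthread_mutex_lock(&r->lock);
    while (!w->done)
        pthread_cond_wait(&r->wake, &r->lock);
    assert(w->done);
    pthread_mutex_unlock(&r->lock);
    return w->result;
}

/* Callback side: 1 = woke a waiting copy, 0 = parked as pending, -1 = allocation failed. */
int cb_arrive(struct rendezvous *r, int id, int result)
{
    struct waiter **wp;
    struct pending_cb *cb;
    int ret = 0;

    pthread_mutex_lock(&r->lock);
    for (wp = &r->copies; *wp; wp = &(*wp)->next) {
        if ((*wp)->id == id) {
            struct waiter *w = *wp;
            *wp = w->next;              /* unlink the waiting copy */
            w->result = result;
            w->done = 1;
            pthread_cond_broadcast(&r->wake);
            ret = 1;
            goto out;
        }
    }
    cb = malloc(sizeof(*cb));           /* no copy yet: remember the callback */
    if (!cb) {
        ret = -1;
        goto out;
    }
    cb->id = id;
    cb->result = result;
    cb->next = r->pending;
    r->pending = cb;
out:
    pthread_mutex_unlock(&r->lock);
    return ret;
}
```

In this sketch the callback side has exactly one failure path, the
allocation for the parked entry. An unrecognized id is simply queued
rather than rejected, which is the behavior questioned above.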