Re: handle_async_copy calling kzalloc under spinlock

"J. Bruce Fields" <bfields@xxxxxxxxxxxx> · Fri, 16 Nov 2018 14:30:16 -0500

On Fri, Nov 16, 2018 at 01:52:29PM -0500, Olga Kornievskaia wrote:
> On Fri, Nov 16, 2018 at 1:30 PM Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> > Then how does the copy know not to go wait for the callback? Copy
> > checks the pending_callback list to see if it received a callback. If
> > not, it puts itself on the copy list and goes to sleep. The callback
> > checks the copy list and, if it finds a copy, signals it; if not, it
> > puts itself on the pending_callback list. A lock is held over checking
> > one list and putting yourself on the other.
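
For readers following along, the two-list handshake described above can
be sketched in userspace C roughly as follows. This is only an
illustration under stated assumptions, not the actual NFS client code:
struct entry, rendezvous(), copy_arrives() and callback_arrives() are
hypothetical names, a pthread mutex stands in for the spinlock, and the
stateid is reduced to an integer. The entry is allocated before the lock
is taken, which is one common way to avoid allocating under a spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry on either list. */
struct entry {
    unsigned int stateid;          /* simplified; real stateids are opaque */
    struct entry *next;
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct entry *pending_callbacks;  /* callbacks that arrived first */
static struct entry *waiting_copies;     /* copies waiting for a callback */

/* Unlink and return the entry matching stateid, or NULL. Caller holds lock. */
static struct entry *take(struct entry **head, unsigned int stateid)
{
    assert(head != NULL);
    for (struct entry **p = head; *p; p = &(*p)->next) {
        if ((*p)->stateid == stateid) {
            struct entry *e = *p;
            *p = e->next;
            return e;
        }
    }
    return NULL;
}

/*
 * Look for a partner on `check`; if there is none, queue ourselves on
 * `enqueue`. Both steps happen under one lock so the two sides cannot
 * miss each other. Returns 1 if a partner was found, 0 if we queued
 * ourselves, -1 on allocation failure.
 */
static int rendezvous(struct entry **check, struct entry **enqueue,
                      unsigned int stateid)
{
    /* Allocate before locking, so nothing can block under the lock. */
    struct entry *fresh = malloc(sizeof(*fresh));
    if (!fresh)
        return -1;
    fresh->stateid = stateid;

    pthread_mutex_lock(&lock);
    struct entry *match = take(check, stateid);
    if (!match) {
        fresh->next = *enqueue;
        *enqueue = fresh;
        fresh = NULL;              /* ownership moved to the list */
    }
    pthread_mutex_unlock(&lock);

    int found = (match != NULL);
    free(match);                   /* a real callback would wake the copy here */
    free(fresh);                   /* unused preallocation; free(NULL) is a no-op */
    return found;
}

/* The copy checks pending callbacks, else waits on the copy list. */
static int copy_arrives(unsigned int stateid)
{
    return rendezvous(&pending_callbacks, &waiting_copies, stateid);
}

/* The callback checks waiting copies, else parks on the pending list. */
static int callback_arrives(unsigned int stateid)
{
    return rendezvous(&waiting_copies, &pending_callbacks, stateid);
}
```

The cost of preallocating is one wasted malloc/free pair whenever a
partner is already present; in exchange, the locked section never
allocates.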

OK, apologies, I don't really understand those data structures yet, but
something seems wrong to me.

Under what circumstances could we receive a CB_OFFLOAD without having
started the corresponding copy already?

And shouldn't CB_OFFLOAD be returning bad_stateid in the case it doesn't
recognize the given stateid?  It looks like the allocation failure is
the *only* way we'll return an error on CB_OFFLOAD, and that seems
wrong.

> > > I also wonder if SERVERFAULT is really the best error for a memory
> > > allocation failure there.
> >
> > I guess EIO or ENOMEM might be better. But I don't think this error
> > gets returned anywhere to the main process.
> >
> 
> Wait. It is returning SERVERFAULT because it's the callback server replying
> back to the server's CB_RECALL call and I believe SERVERFAULT is the
> appropriate error here. NFS doesn't have an ENOMEM error.

We could return DELAY if we think it might be worth the server trying
the CB_RECALL again.  (That's what nfsd usually returns on allocation
failures.  I don't know if that's really ideal.)
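
To make the DELAY suggestion concrete, here is a minimal userspace
sketch, not a proposed patch. handle_offload_sketch(), alloc_fn and
failing_alloc() are hypothetical names invented for illustration; only
the numeric status values are taken from the NFSv4 specification.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* NFSv4 status codes, with the values defined in RFC 7530. */
enum {
    NFS4_OK             = 0,
    NFS4ERR_SERVERFAULT = 10006,
    NFS4ERR_DELAY       = 10008,
};

/* Hypothetical allocator hook so an allocation failure can be simulated. */
typedef void *(*alloc_fn)(size_t);

/* Always fails, standing in for an out-of-memory condition. */
static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/*
 * Sketch of a callback handler that needs a small allocation. On failure
 * it replies NFS4ERR_DELAY, which invites the peer to retry later, rather
 * than NFS4ERR_SERVERFAULT, which reads as a hard error.
 */
static uint32_t handle_offload_sketch(alloc_fn alloc)
{
    assert(alloc != NULL);

    void *state = alloc(64);       /* size is arbitrary for this sketch */
    if (!state)
        return NFS4ERR_DELAY;

    /* ... a real handler would record the callback result here ... */
    free(state);
    return NFS4_OK;
}
```

Whether a retry is actually useful depends on how the peer treats
NFS4ERR_DELAY for this callback, which the sketch does not model.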

--b.