Re: Orangefs ABI documentation

Al Viro <viro@xxxxxxxxxxxxxxxxxx> · Wed, 17 Feb 2016 21:17:54 +0000

On Wed, Feb 17, 2016 at 08:11:20PM +0000, Al Viro wrote:
> With reinit_completion() added?
> 
> > Maybe this is relevant:
> > 
> > Alloced OP ffff880015698000 <- doomed op for orangefs_create MAILBOX2.CPT
> > service_operation: orangefs_create op ffff880015698000
> > ffff880015698000 got past is_daemon_in_service
> > 
> > ... lots of stuff ...
> > 
> > w_f_m_d returned -11 for ffff880015698000 <- first op to get EAGAIN
> > 
> > first client core is NOT in service
> > second op to get EAGAIN
> >           ...
> > last client core is NOT in service
> > 
> > ... lots of stuff ...
> > 
> > service_operation returns to orangef_create with handle 0 fsid 0 ret 0
> > for MAILBOX2.CPT
> > 
> > I'm guessing you want me to wait to do the switching of my branch
> > until we fix this (last?) thing, let me know...
> 
> What I'd like to check is the value of op->waitq.done at retry_servicing.
> If we get there with a non-zero value, we've a problem.  BTW, do you
> hit any of gossip_err() in orangefs_clean_up_interrupted_operation()?

Am I right assuming that you are seeing zero from service_operation()
for ORANGEFS_VFS_OP_CREATE with zeroed ->downcall.resp.create.refn?
AFAICS, that can only happen with wait_for_matching_downcall() returning
0, which mean op_state_serviced(op).  And prior to that we had
set_op_state_waiting(op), which means set_op_state_serviced(op) done between
those two...

So unless you have something really nasty going on (buggered op refcounting,
memory corruption, etc.), we had orangefs_devreq_write_iter() pick that
op and copy the entire op->downcall from userland.  With op->downcall.status
not set to -EFAULT, or we would've...  Oh, shit - it's fed through
orangefs_normalize_to_errno().  OTOH, it would've hit
gossip_err("orangefs: orangefs_normalize_to_errno: got error code which is not
from ORANGEFS.\n") and presumably you would've noticed that.  And it still
would've returned that -EFAULT intact, actually.  In any case, it's a bug -
we need
	ret = -(ORANGEFS_ERROR_BIT|9);
	goto Broken;
instead of those
	ret = -EFAULT;
	goto Broken;
and
	ret = -(ORANGEFS_ERROR_BIT|8);
	goto Broken;
instead of
	ret = -ENOMEM;
	goto Broken;
in orangefs_devreq_write_iter().

Anyway, it looks like we didn't go through Broken: in there, so copy_from_user()
must've been successful.

Hmm...  Could the daemon have really given that reply?  Relevant test would
be something like
	WARN_ON(op->upcall.type == ORANGEFS_VFS_OP_CREATE &&
	    !op->downcall.resp.create.refn.fsid);
right after
wakeup:
        /*
         * tell the vfs op waiting on a waitqueue
         * that this op is done
         */
        spin_lock(&op->lock);
        if (unlikely(op_state_given_up(op))) {
                spin_unlock(&op->lock);
                goto out;
        }
        set_op_state_serviced(op);

and plain WARN_ON(1) right after Broken:

If that op goes through either of those set_op_state_serviced() with that
zero refn.fsid, we'll see a WARN_ON triggered...
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html