Re: Orangefs ABI documentation

Martin Brandenburg <martin@xxxxxxxxxxxx> · Wed, 17 Feb 2016 17:40:08 -0500 (EST)

On Wed, 17 Feb 2016, Mike Marshall wrote:

> It is still busted, I've been trying to find clues as to why...
> 
> Maybe this is relevant:
> 
> Alloced OP ffff880015698000 <- doomed op for orangefs_create MAILBOX2.CPT
> service_operation: orangefs_create op ffff880015698000
> ffff880015698000 got past is_daemon_in_service
> 
> ... lots of stuff ...
> 
> w_f_m_d returned -11 for ffff880015698000 <- first op to get EAGAIN
> 
> first client core is NOT in service
> second op to get EAGAIN
>           ...
> last client core is NOT in service
> 
> ... lots of stuff ...
> 
> service_operation returns to orangefs_create with handle 0 fsid 0 ret 0
> for MAILBOX2.CPT
> 
> I'm guessing you want me to wait to do the switching of my branch
> until we fix this (last?) thing, let me know...
> 
> -Mike

I think I've identified something screwy.

Some process creates a file. Eventually we get into
wait_for_matching_downcall with no client-core running.
w_f_m_d returns -EAGAIN with op->lock held. The op is
still in the waiting state and still on
orangefs_request_list. service_operation calls
orangefs_clean_up_interrupted_operation, which attempts
to remove the op from orangefs_request_list.

Meanwhile the client-core comes back and does a read.
w_f_m_d has already returned -EAGAIN, but the op is still
on orangefs_request_list, so it gets passed to the
client-core. Now the op is in service and in
htable_ops_in_progress.

But service_operation is about to retry it under the
impression that it was purged. So it puts the op back
in orangefs_request_list.

Then the client-core returns the op, so it is marked
serviced and returned to orangefs_inode_create.
Meanwhile something or other (great theory right? now
I'm less sure) happens with the second request (they
have the same tag) causing it to become corrupted.

I admit it starts to fall apart at the end, and I don't
have a clear theory on how this produces what we see.

In orangefs_clean_up_interrupted_operation:

	if (op_state_waiting(op)) {
		/*
		 * upcall hasn't been read; remove op from upcall request
		 * list.
		 */
		spin_unlock(&op->lock);

		/* HERE */

		spin_lock(&orangefs_request_list_lock);
		list_del(&op->list);
		spin_unlock(&orangefs_request_list_lock);
		gossip_debug(GOSSIP_WAIT_DEBUG,
			     "Interrupted: Removed op %p from request_list\n",
			     op);
	} else if (op_state_in_progress(op)) {

and in orangefs_devreq_read:

restart:
	/* Get next op (if any) from top of list. */
	spin_lock(&orangefs_request_list_lock);
	list_for_each_entry_safe(op, temp, &orangefs_request_list, list) {
		__s32 fsid;
		/* This lock is held past the end of the loop when we break. */

		/* HERE */

		spin_lock(&op->lock);
		if (unlikely(op_state_purged(op))) {
			spin_unlock(&op->lock);
			continue;
		}

I think both processes can end up working on the same
op.
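
To make the suspected interleaving concrete, here is a small
userspace model of it. This is only a sketch, not kernel code:
the struct, the enum and every function name below are
hypothetical stand-ins, the locks are reduced to the order in
which the steps run, and the retry behaviour follows my
description above rather than anything I have verified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the op states named in this thread. */
enum op_state { OP_WAITING, OP_IN_PROGRESS };

struct op {
	enum op_state state;
	bool on_request_list;	/* models orangefs_request_list membership */
	bool in_htable;		/* models htable_ops_in_progress membership */
};

/* Waiter, first half: test the state under op->lock, then drop the lock. */
static bool waiter_sees_waiting(const struct op *op)
{
	return op->state == OP_WAITING;
}

/* Reader: runs in the window marked HERE and hands the op to client-core. */
static bool reader_takes_op(struct op *op)
{
	if (!op->on_request_list || op->state != OP_WAITING)
		return false;
	op->on_request_list = false;
	op->in_htable = true;
	op->state = OP_IN_PROGRESS;
	return true;
}

/*
 * Waiter, second half: acts on the stale state check. It removes the op
 * from the request list and, as assumed above, requeues it for a retry.
 */
static void waiter_removes_and_requeues(struct op *op)
{
	op->on_request_list = false;
	op->state = OP_WAITING;
	op->on_request_list = true;
}

/* Returns how many parties ended up believing they own the op. */
int simulate_race(void)
{
	struct op op = { .state = OP_WAITING, .on_request_list = true,
			 .in_htable = false };
	int owners = 0;
	bool stale;

	assert(op.state == OP_WAITING);

	stale = waiter_sees_waiting(&op);	/* op->lock dropped after this */
	if (reader_takes_op(&op))		/* reader wins the window */
		owners++;
	if (stale) {				/* waiter never re-checks */
		waiter_removes_and_requeues(&op);
		owners++;
	}

	printf("owners=%d on_request_list=%d in_htable=%d\n",
	       owners, op.on_request_list, op.in_htable);
	return owners;
}
```

In this model simulate_race() returns 2: the op ends up both
on the request list and in the in-progress table, because the
waiter acts on a state it checked before dropping op->lock.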

-- Martin