On Thu, Feb 18, 2016 at 01:58:52PM -0500, Mike Marshall wrote:

> wait_for_matching_downcall: operation purged (tag 10889, ffff880012898000, att 0
> service_operation: wait_for_matching_downcall returned -11 for ffff880012898000
> Interrupted: Removed op ffff880012898000 from htable_ops_in_progress

state is "in progress"

> tag 10889 (orangefs_create) -- operation to be retried (1 attempt)
> service_operation: orangefs_create op:ffff880012898000: moved to "waiting"
> service_operation:client core is NOT in service, ffff880012898000
>
>
>
> service_operation: wait_for_matching_downcall returned 0 for ffff880012898000
> service_operation orangefs_create returning: 0 for ffff880012898000

... and we've got to "serviced" somehow.  IDGI...  Are you sure that it's
not a daemon replying with zero fsid?  Could you slap

	gossip_debug(GOSSIP_WAIT_DEBUG,
		     "%s: %s op:%p: process:%s state -> %d\n",
		     __func__, op_name, op, current->comm, op->op_state);

after assignments to ->op_state in set_op_state_purged() and
set_op_state_serviced() as well as after the calls of set_op_state_waiting()
(in service_operation() and orangefs_devreq_read()) and
set_op_state_inprogress() (in orangefs_devreq_read()).

Another thing: in orangefs_devreq_write_iter(), just before the
set_op_state_serviced() add

	WARN_ON(op->upcall.type == ORANGEFS_VFS_OP_CREATE &&
		!op->downcall.create.refn.fs_id);

to make sure that this crap isn't coming from the daemon.

While we are at it -

	#define op_is_cancel(op) ((op)->downcall.type == ORANGEFS_VFS_OP_CANCEL)

is checking the wrong thing; should be

	#define op_is_cancel(op) ((op)->upcall.type == ORANGEFS_VFS_OP_CANCEL)

Shouldn't be worse than a leak, though, so I doubt that it could be causing
this problem...