On Wed, Feb 17, 2016 at 02:24:34PM -0500, Mike Marshall wrote:
> It is still busted, I've been trying to find clues as to why...

With reinit_completion() added?

> Maybe this is relevant:
>
> Alloced OP ffff880015698000 <- doomed op for orangefs_create MAILBOX2.CPT
> service_operation: orangefs_create op ffff880015698000
> ffff880015698000 got past is_daemon_in_service
>
> ... lots of stuff ...
>
> w_f_m_d returned -11 for ffff880015698000 <- first op to get EAGAIN
>
> first client core is NOT in service
> second op to get EAGAIN
> ...
> last client core is NOT in service
>
> ... lots of stuff ...
>
> service_operation returns to orangef_create with handle 0 fsid 0 ret 0
> for MAILBOX2.CPT
>
> I'm guessing you want me to wait to do the switching of my branch
> until we fix this (last?) thing, let me know...

What I'd like to check is the value of op->waitq.done at retry_servicing.
If we get there with a non-zero value, we've a problem.  BTW, do you
hit any of gossip_err() in orangefs_clean_up_interrupted_operation()?
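
To illustrate why a non-zero op->waitq.done at retry_servicing would be a problem, here is a minimal userspace sketch. It is not the kernel or orangefs code. The model_* names and retry_skips_wait() are hypothetical stand-ins, and the "late completion after the waiter gave up" sequence is only one assumed way the counter could end up non-zero. The model relies on one property of kernel completions: a wait returns at once and consumes a count when done is non-zero, and reinit_completion() resets done to 0.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct model_completion {
    unsigned int done;   /* completions recorded but not yet consumed */
};

/* Models reinit_completion(): discard any stale completion count. */
static void model_reinit(struct model_completion *c)
{
    c->done = 0;
}

/* Models complete(): record one completion for a waiter to consume. */
static void model_complete(struct model_completion *c)
{
    c->done++;
}

/*
 * Models the decision a waiter makes: returns 1 and consumes one
 * completion if done is non-zero, returns 0 if the waiter would sleep.
 */
static int model_try_wait(struct model_completion *c)
{
    if (c->done == 0)
        return 0;
    c->done--;
    return 1;
}

/*
 * Replays the assumed sequence: the first attempt gives up (for example
 * with -EAGAIN), a completion arrives afterwards, then the op is retried.
 * Returns 1 if the retry would skip waiting, 0 if it would wait.
 */
static int retry_skips_wait(int reinit_before_retry)
{
    struct model_completion waitq = { .done = 0 };

    /* First attempt: nothing has completed, so the waiter would sleep
     * and then give up. */
    assert(model_try_wait(&waitq) == 0);

    /* Assumed: a late completion lands after the waiter gave up. */
    model_complete(&waitq);

    if (reinit_before_retry)
        model_reinit(&waitq);

    /* retry_servicing: does the second wait return immediately? */
    return model_try_wait(&waitq);
}
```

In this model, retry_skips_wait(0) reports that the retry returns without waiting for a fresh answer, while retry_skips_wait(1) reports that it waits, which is the behaviour reinit_completion() is meant to restore.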