I haven't been trussing it... it reports EINVAL to stderr... I find
the ops to look at in the debug output by looking for the -22...

(373) open ./clients/client8/~dmtmp/PARADOX/STUDENTS.DB failed for handle 9981 (Invalid argument)

I just got the whacky code <g> from Al's last message to compile,
I'll have results from that soon...

-Mike

On Thu, Feb 18, 2016 at 2:49 PM, Martin Brandenburg <martin@xxxxxxxxxxxx> wrote:
> On Thu, 18 Feb 2016, Mike Marshall wrote:
>
>> Still busted, exactly the same, I think. The doomed op gets a good
>> return code from is_daemon_in_service in service_operation but
>> gets EAGAIN from wait_for_matching_downcall... an edge case kind of
>> problem.
>>
>> Here's the raw (well, slightly edited for readability) logs showing
>> the doomed op and subsequent failed op that uses the bogus handle
>> and fsid from the doomed op.
>>
>>
>>
>> Alloced OP (ffff880012898000: 10889 OP_CREATE)
>> service_operation: orangefs_create op:ffff880012898000:
>>
>>
>>
>> wait_for_matching_downcall: operation purged (tag 10889, ffff880012898000, att 0
>> service_operation: wait_for_matching_downcall returned -11 for ffff880012898000
>> Interrupted: Removed op ffff880012898000 from htable_ops_in_progress
>> tag 10889 (orangefs_create) -- operation to be retried (1 attempt)
>> service_operation: orangefs_create op:ffff880012898000:
>> service_operation:client core is NOT in service, ffff880012898000
>>
>>
>>
>> service_operation: wait_for_matching_downcall returned 0 for ffff880012898000
>> service_operation orangefs_create returning: 0 for ffff880012898000
>> orangefs_create: PPTOOLS1.PPA:
>> handle:00000000-0000-0000-0000-000000000000: fsid:0:
>> new_op:ffff880012898000: ret:0:
>>
>>
>>
>> Alloced OP (ffff880012888000: 10958 OP_GETATTR)
>> service_operation: orangefs_inode_getattr op:ffff880012888000:
>> service_operation: wait_for_matching_downcall returned 0 for ffff880012888000
>> service_operation orangefs_inode_getattr returning: -22 for ffff880012888000
>> Releasing OP (ffff880012888000: 10958
>> orangefs_create: Failed to allocate inode for file :PPTOOLS1.PPA:
>> Releasing OP (ffff880012898000: 10889
>>
>>
>>
>>
>> What I'm testing with differs from what is at kernel.org#for-next by
>> - diffs from Al's most recent email
>> - 1 souped up gossip message
>> - changed 0 to OP_VFS_STATE_UNKNOWN one place in service_operation
>> - reinit_completion(&op->waitq) in orangefs_clean_up_interrupted_operation
>>
>>
>
> Mike,
>
> what error do you get from userspace (i.e. from dbench)?
>
> open("./clients/client0/~dmtmp/EXCEL/5D7C0000", O_RDWR|O_CREAT, 0600) = -1 ENODEV (No such device)
>
> An interesting note is that I can't reproduce at all
> with only one dbench process. It seems there's not
> enough load.
>
> I don't see how the kernel could return ENODEV at all.
> This may be coming from our client-core.
>
> -- Martin