Re: Orangefs ABI documentation

I haven't put together an edited listing of the debug output yet,
but most importantly: the WARN_ON is hit. It appears that
the client-core is sending over fsid:0:
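
For illustration only, here is a minimal userspace C sketch of the kind
of sanity check that would catch a "successful" create reply carrying an
all-zero handle or fsid 0, like the one in the log quoted below. The
struct and function names (sketch_khandle, sketch_object_ref,
sketch_check_create_reply) are made up for this sketch and are not the
actual orangefs code; the 16-byte handle and 32-bit fsid layout is an
assumption based on how they print in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the handle/fsid pair returned by a create. */
struct sketch_khandle {
	unsigned char u[16];	/* assumed 16-byte handle */
};

struct sketch_object_ref {
	struct sketch_khandle khandle;
	int32_t fs_id;		/* assumed 32-bit fsid */
};

/* True if every byte of the handle is zero (00000000-0000-...). */
bool sketch_khandle_is_null(const struct sketch_khandle *kh)
{
	static const struct sketch_khandle zero;

	assert(kh != NULL);
	return memcmp(kh->u, zero.u, sizeof(zero.u)) == 0;
}

/*
 * Pass a failing status through unchanged. For a status of 0, refuse
 * to trust the reply if the handle is all zeros or the fsid is 0, and
 * report -EINVAL instead of letting a later getattr trip over it.
 */
int sketch_check_create_reply(int status, const struct sketch_object_ref *ref)
{
	assert(ref != NULL);
	if (status != 0)
		return status;
	if (ref->fs_id == 0 || sketch_khandle_is_null(&ref->khandle))
		return -EINVAL;
	return 0;
}
```

With a check like this, the create would fail up front rather than
returning 0 and leaving the following OP_GETATTR to come back with -22.
It only flags the symptom; it says nothing about why the reply comes
back zeroed.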

-Mike

On Thu, Feb 18, 2016 at 3:08 PM, Mike Marshall <hubcap@xxxxxxxxxxxx> wrote:
> I haven't been trussing it... it reports EINVAL to stderr... I find
> the ops to look at in the debug output by looking for the -22...
>
> (373) open ./clients/client8/~dmtmp/PARADOX/STUDENTS.DB failed for
> handle 9981 (Invalid argument)
>
> I just got the whacky code <g> from Al's last message to compile, I'll
> have results from that soon...
>
> -Mike
>
> On Thu, Feb 18, 2016 at 2:49 PM, Martin Brandenburg <martin@xxxxxxxxxxxx> wrote:
>> On Thu, 18 Feb 2016, Mike Marshall wrote:
>>
>>> Still busted, exactly the same, I think. The doomed op gets a good
>>> return code from is_daemon_in_service in service_operation but
>>> gets EAGAIN from wait_for_matching_downcall... an edge case kind of
>>> problem.
>>>
>>> Here's the raw (well, slightly edited for readability) logs showing
>>> the doomed op and subsequent failed op that uses the bogus handle
>>> and fsid from the doomed op.
>>>
>>>
>>>
>>> Alloced OP (ffff880012898000: 10889 OP_CREATE)
>>> service_operation: orangefs_create op:ffff880012898000:
>>>
>>>
>>>
>>> wait_for_matching_downcall: operation purged (tag 10889, ffff880012898000, att 0
>>> service_operation: wait_for_matching_downcall returned -11 for ffff880012898000
>>> Interrupted: Removed op ffff880012898000 from htable_ops_in_progress
>>> tag 10889 (orangefs_create) -- operation to be retried (1 attempt)
>>> service_operation: orangefs_create op:ffff880012898000:
>>> service_operation:client core is NOT in service, ffff880012898000
>>>
>>>
>>>
>>> service_operation: wait_for_matching_downcall returned 0 for ffff880012898000
>>> service_operation orangefs_create returning: 0 for ffff880012898000
>>> orangefs_create: PPTOOLS1.PPA:
>>> handle:00000000-0000-0000-0000-000000000000: fsid:0:
>>> new_op:ffff880012898000: ret:0:
>>>
>>>
>>>
>>> Alloced OP (ffff880012888000: 10958 OP_GETATTR)
>>> service_operation: orangefs_inode_getattr op:ffff880012888000:
>>> service_operation: wait_for_matching_downcall returned 0 for ffff880012888000
>>> service_operation orangefs_inode_getattr returning: -22 for ffff880012888000
>>> Releasing OP (ffff880012888000: 10958
>>> orangefs_create: Failed to allocate inode for file :PPTOOLS1.PPA:
>>> Releasing OP (ffff880012898000: 10889
>>>
>>>
>>>
>>>
>>> What I'm testing with differs from what is at kernel.org#for-next by
>>>   - diffs from Al's most recent email
>>>   - 1 souped up gossip message
>>>   - changed 0 to OP_VFS_STATE_UNKNOWN one place in service_operation
>>>   - reinit_completion(&op->waitq) in orangefs_clean_up_interrupted_operation
>>>
>>>
>>>
>>
>> Mike,
>>
>> what error do you get from userspace (i.e. from dbench)?
>>
>> open("./clients/client0/~dmtmp/EXCEL/5D7C0000", O_RDWR|O_CREAT, 0600) = -1 ENODEV (No such device)
>>
>> An interesting note is that I can't reproduce at all
>> with only one dbench process. It seems there's not
>> enough load.
>>
>> I don't see how the kernel could return ENODEV at all.
>> This may be coming from our client-core.
>>
>> -- Martin