Re: Caught the first erroneous translated errorcode

On 17-6-2017 22:59, Willem Jan Withagen wrote:
> On 17-6-2017 19:52, John Spray wrote:
>> On Sat, Jun 17, 2017 at 11:50 AM, Willem Jan Withagen <wjw@xxxxxxxxxxx> wrote:
>>> Hi,
>>>
>>> I think I've found the first case where the errno translation (ceph ->
>>> hostos -> client-ceph) goes wrong.
>>>
>>> Repeatedly I get the following error:
>>> 116: /home/jenkins/workspace/ceph-master/src/test/libradosstriper/rados-striper.sh:42: run:  rados --pool rbd --striper put toyfile td/rados-striper/toyfile
>>> 116: 2017-06-17 12:32:05.290234 810016000 -1 libradosstriper: RadosStriperImpl::openStripedObjectForWrite : could not set new size for toyfile : rc = -125
>>> error putting rbd/toyfile: (125) Unknown error: 125
>>>
>>> 125 is ECANCELED on Linux,
>>> but on FreeBSD:
>>> #define ECANCELED       85              /* Operation canceled */
>>>
>>> So the server probably returns ECANCELED in network format (125),
>>> but the client does not translate it back.
>>
>> Somewhat related perhaps: people running cephfs on ARM recently had
>> this problem. In that case the solution was simply to define in Ceph
>> some constants that mirror the Linux ones; see commit 88d2da5e9.
> 
> Hi John,
> 
> I think that is at the other end from where I'm working.
> That commit is about the flags passed to file open,
> which sort of surprises me, since that is Linux <> Linux.
> So perhaps it is all about big-endian <> little-endian.
> 
> My PR is more about the error codes that differ between Linux and
> FreeBSD. A FreeBSD client would not (correctly) understand the error
> codes that a Linux server issues.
> So on a server I translate all error codes to the Linux (wire) codes
> (hostos_to_ceph_errno_conv()), and in a FreeBSD client I translate the
> wire error codes into FreeBSD codes (ceph_to_hostos_errno_conv()).
> 
> Specifically, in this case:
> ECANCELED is 125 on Linux, and that is the on-wire code.
> So the servers started in the test will signal ECANCELED with value 125,
> but because the rados-striper code does not translate that back into 85
> (FreeBSD ECANCELED) it is reported as an unknown error, whereas in this
> part of the code ECANCELED is a valid return and is adequately handled.
> 
> So that is why I suspect that the rados code does not take this
> translation into account. That is not surprising, since only the code
> from OS to wire was available; the return path was not included until
> I introduced it. But it is hard to find all the locations where it
> should be applied.
> 
> So I'm looking for (all) the correct places to insert:
> 	ceph_to_hostos_errno_conv(err)
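
To make the back-translation described above concrete, here is a minimal
sketch. It assumes the wire format uses Linux errno numbering (so ECANCELED
travels as 125) and maps only that one code; the names kWireECanceled and
wire_to_host_errno are hypothetical stand-ins, not the functions from the PR.

```cpp
#include <cerrno>

// Wire (Linux) value of ECANCELED, as quoted in this thread.
constexpr int kWireECanceled = 125;

// Hypothetical helper: translate a wire errno into the host's numbering.
// Only ECANCELED is mapped in this sketch; a real table would cover every
// code that differs between the wire format and the host OS.
// The sign is preserved, because results are passed around as negative errno.
int wire_to_host_errno(int r) {
  const int e = r < 0 ? -r : r;
  int host;
  switch (e) {
    case kWireECanceled:
      host = ECANCELED;  // 85 on FreeBSD, 125 on Linux
      break;
    default:
      host = e;          // assumption: unmapped codes pass through unchanged
      break;
  }
  return r < 0 ? -host : host;
}
```

With this, wire_to_host_errno(-125) gives -ECANCELED on whatever host the
client is built for: -85 on FreeBSD, and unchanged on Linux.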

I think I might have found some locations where the result from the wire
is fetched, but I'm not sure if this is the correct level:

os/fs/aio.h:42:  int get_return_value() {
libradosstriper/MultiAioCompletionImpl.h:116:  int get_return_value() {
librados/AioCompletionImpl.h:115:  int get_return_value() {
librados/PoolAsyncCompletionImpl.h:59:    int get_return_value() {
librbd/io/AioCompletion.cc:199:ssize_t AioCompletion::get_return_value() {
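
As a rough illustration of what translating at this level could look like,
here is a sketch with a simplified, hypothetical completion type. It is not
the real AioCompletionImpl, and sketch_wire_to_host is a stand-in that maps
only ECANCELED. It assumes that negative values are wire errnos and that
non-negative values are plain results (for example byte counts) that must
not be touched.

```cpp
#include <cerrno>

// Hypothetical stand-in for the wire-to-host translation; only the
// ECANCELED case from this thread is mapped (wire value 125).
inline int sketch_wire_to_host(int neg_wire_errno) {
  return neg_wire_errno == -125 ? -ECANCELED : neg_wire_errno;
}

// Hypothetical, simplified completion object.
struct SketchCompletion {
  int rval = 0;  // result exactly as received from the wire

  // Translate at the point where the caller reads the result.
  // Non-negative values are returned as-is: a read of 125 bytes must
  // not be mistaken for an error code.
  int get_return_value() const {
    return rval < 0 ? sketch_wire_to_host(rval) : rval;
  }
};
```

The check on the sign is the part that matters here: these accessors return
lengths as well as errors, so a blind translation of every value would be
wrong.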

--WjW

