Re: Caught the first erroneous translated errorcode


 



Try changing

  int32_t rval;

in OSDOp in osd_types.h to errorcode32_t.

sage
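
To illustrate the idea behind the suggestion above, here is a minimal sketch of an error-code field that converts between host and wire numbering when it is encoded or decoded. It is not Ceph's actual errorcode32_t: the name translated_errorcode32, the helper functions, and the single-entry mapping are all hypothetical. The only facts it relies on are that ECANCELED is 125 on Linux and 85 on FreeBSD, and it assumes Linux numbering is what travels over the wire.

```cpp
#include <cassert>
#include <cstdint>

// ECANCELED is 125 on Linux and 85 on FreeBSD. The constants are spelled out
// here (instead of using <cerrno>) so the sketch behaves the same on any host.
constexpr int32_t WIRE_ECANCELED = 125;  // Linux numbering, assumed wire format
constexpr int32_t HOST_ECANCELED = 85;   // FreeBSD numbering, simulated host

// Map one absolute error value. A real table would have to cover every code
// that differs between the two platforms, not just this one.
constexpr int32_t map_abs(int32_t a, int32_t from, int32_t to) {
  return a == from ? to : a;
}

// Error codes travel as negative values, so only the magnitude is translated.
constexpr int32_t host_to_wire(int32_t r) {
  return r < 0 ? -map_abs(-r, HOST_ECANCELED, WIRE_ECANCELED)
               : map_abs(r, HOST_ECANCELED, WIRE_ECANCELED);
}

constexpr int32_t wire_to_host(int32_t r) {
  return r < 0 ? -map_abs(-r, WIRE_ECANCELED, HOST_ECANCELED)
               : map_abs(r, WIRE_ECANCELED, HOST_ECANCELED);
}

// Hypothetical stand-in for a translating 32-bit error-code field.
struct translated_errorcode32 {
  int32_t code = 0;  // always held in host numbering

  translated_errorcode32() = default;
  translated_errorcode32(int32_t c) : code(c) {}
  operator int32_t() const { return code; }

  // Convert to wire numbering when the value is sent.
  int32_t encode() const { return host_to_wire(code); }
  // Convert back to host numbering when the value is received.
  void decode(int32_t wire) { code = wire_to_host(wire); }
};

// Compile-time checks of the one mapping this sketch knows about.
static_assert(wire_to_host(-125) == -85, "wire ECANCELED maps to host value");
static_assert(host_to_wire(-85) == -125, "host ECANCELED maps to wire value");
```

With a field of this kind, code that receives the value only ever sees host numbering, so a comparison against the host's own -ECANCELED would match without any translation at the call site.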


On Tue, 20 Jun 2017, Willem Jan Withagen wrote:

> On 19-6-2017 17:45, Willem Jan Withagen wrote:
> > On 19-6-2017 16:55, Gregory Farnum wrote:
> >> On Mon, Jun 19, 2017 at 7:46 AM, Willem Jan Withagen <wjw@xxxxxxxxxxx> wrote:
> >>> On 19-6-2017 at 16:31, Sage Weil wrote:
> >>>>
> >>>> On Mon, 19 Jun 2017, Willem Jan Withagen wrote:
> >>>>>
> >>>>> On 19-6-2017 14:56, Jason Dillaman wrote:
> >>>>>>
> >>>>>> On Sun, Jun 18, 2017 at 1:18 PM, Willem Jan Withagen <wjw@xxxxxxxxxxx>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> librbd/io/AioCompletion.cc:199:ssize_t
> >>>>>>> AioCompletion::get_return_value() {
> >>>>>>
> >>>>>>
> >>>>>> librbd just wraps librados, so I would think all the error codes
> >>>>>> should have already been properly translated before it reaches this
> >>>>>> level since otherwise any internal librbd error logging will output
> >>>>>> the incorrect failure reason. I'd suspect most of the client-side
> >>>>>> handling should probably be done inside osdc/Objecter.h/cc.
> >>>>>
> >>>>> Hi Jason,
> >>>>>
> >>>>> Thanks for the pointer. Changing any of the librbd stuff did indeed not
> >>>>> result in a working rados-striper.sh.
> >>>>>
> >>>>> Objecter.{h,cc} already had the forward error rewrite. I added the
> >>>>> reverse in the original patch, but obviously that is not enough (yet).
> >>>>> So I'll start digging a bit more in the librados files as you suggested.
> >>>>
> >>>> I think the place to do this is in MOSDOpReply. That alone should be
> >>>> enough to translate the value as it passes over the wire.
> >>>
> >>>
> >>> Hi Sage,
> >>>
> >>> The interesting part of this is that ALL tests but one actually work. So
> >>> all tests that start a cluster through vstart actually do work, EXCEPT for
> >>> rados-striper.sh.
> >>>
> >>> Now this makes me question what is different about the striper code that
> >>> causes an ECANCELED to not be translated back to the FreeBSD code.
> >>
> >> I'm not sure exactly how it's arranged, but libradosstriper is layered
> >> on top of librados and I don't think anybody's done any of the errno
> >> translation work for other platforms that you got pointed at.
> >> Depending on how it's done that may mean it's missing big chunks --
> >> for instance, if libradosstriper embeds error codes that aren't
> >> touched by librados, it will need to do its own translation.
> > 
> > Hi Greg,
> > 
> > The error is on the path server -> client.
> > 
> > How do I know: FreeBSD's highest error number atm is 96.
> > ECANCELED is an expected return value in the striper code.
> > So server-side translation seems to be doing what it should.
> > Client-side code is:
> > 
> > 1260 ./src/libradosstriper/RadosStriperImpl.cc
> > ====
> >   bl.append(oss.str());
> >   writeOp.setxattr(XATTR_SIZE, bl);
> >   rc = m_ioCtx.operate(firstObjOid, &writeOp);
> >   // return current size
> >   *size = curSize;
> >   // handle case where objectsize is already bigger than size
> >   if (-ECANCELED == rc)
> >     rc = 0;
> >   if (rc) {
> >     unlockObject(soid, *lockCookie);
> >     lderr(cct()) << "RadosStriperImpl::openStripedObjectForWrite : "
> >                    << "could not set new size for "
> >                    << soid << " : rc = " << rc << dendl;
> >   }
> >   return rc;
> > ====
> > 
> > So I have to drill down into m_ioCtx.operate.
> > But I'll first look at Sage's suggestion.
> 
> I have not been able to find the right spot...
> So I upped the logging, and this is the first place where any reference to
> -125 is made:
> 116: 2017-06-20 01:24:21.556950 80fc18800  5 -- 127.0.0.1:0/1969737172 >> 127.0.0.1:6804/60048 conn(0x81065c000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2 cs=1 l=1). rx osd.1 seq 6 0x810696e00 osd_op_reply(5 toyfile.0000000000000000 [cmpxattr (8) op 3 mode 2,setxattr (4)] v19'4 uv3 ondisk = -125 ((125) Unknown error: 125)) v8
> 116: 2017-06-20 01:24:21.556985 80fc18800  1 -- 127.0.0.1:0/1969737172 <== osd.1 127.0.0.1:6804/60048 6 ==== osd_op_reply(5 toyfile.0000000000000000 [cmpxattr (8) op 3 mode 2,setxattr (4)] v19'4 uv3 ondisk = -125 ((125) Unknown error: 125)) v8 ==== 210+0+0 (669224781 0 0) 0x810696e00 con 0x81065c000
> 116: 2017-06-20 01:24:21.557009 80fc18800 10 client.4115.objecter ms_dispatch 0x80fc33000 osd_op_reply(5 toyfile.0000000000000000 [cmpxattr (8) op 3 mode 2,setxattr (4)] v19'4 uv3 ondisk = -125 ((125) Unknown error: 125)) v8
> 116: 2017-06-20 01:24:21.557024 80fc18800 10 client.4115.objecter in handle_osd_op_reply
> 116: 2017-06-20 01:24:21.557031 80fc18800  7 client.4115.objecter handle_osd_op_reply 5 ondisk uv 3 in 1.3 attempt 0
> 116: 2017-06-20 01:24:21.557038 80fc18800 10 client.4115.objecter  op 0 rval -85 len 0
> 116: 2017-06-20 01:24:21.557043 80fc18800 10 client.4115.objecter  op 1 rval 0 len 0
> 116: 2017-06-20 01:24:21.557047 80fc18800 15 client.4115.objecter handle_osd_op_reply completed tid 5
> 116: 2017-06-20 01:24:21.557050 80fc18800 15 client.4115.objecter finish_op 5
> 116: 2017-06-20 01:24:21.557056 80fc18800 20 client.4115.objecter put_session s=0x810695800 osd=1 4
> 116: 2017-06-20 01:24:21.557060 80fc18800 15 client.4115.objecter _session_op_remove 1 5
> 116: 2017-06-20 01:24:21.557073 80fc18800  5 client.4115.objecter 0 in flight
> 116: 2017-06-20 01:24:21.557085 80fc18800 20 client.4115.objecter put_session s=0x810695800 osd=1 3
> 
> This makes me wonder: does this osd_op_reply contain the numeric error
> value, or is it a formatted text error report of some event on the server?
> In the latter case there is already a translation problem on the server,
> and not in the client.
> 
> --WjW
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 


