On 20-6-2017 04:35, Sage Weil wrote:
> Try changing
>
>   int32_t rval;
>
> in OSDOp in osd_types.h to errorcode32_t.

Nice suggestion, and I think it is a correct one.
But I'm still getting -125 as the error code.

--WjW

>
> sage
>
>
> On Tue, 20 Jun 2017, Willem Jan Withagen wrote:
>
>> On 19-6-2017 17:45, Willem Jan Withagen wrote:
>>> On 19-6-2017 16:55, Gregory Farnum wrote:
>>>> On Mon, Jun 19, 2017 at 7:46 AM, Willem Jan Withagen <wjw@xxxxxxxxxxx> wrote:
>>>>> On 19-6-2017 at 16:31, Sage Weil wrote:
>>>>>>
>>>>>> On Mon, 19 Jun 2017, Willem Jan Withagen wrote:
>>>>>>>
>>>>>>> On 19-6-2017 14:56, Jason Dillaman wrote:
>>>>>>>>
>>>>>>>> On Sun, Jun 18, 2017 at 1:18 PM, Willem Jan Withagen <wjw@xxxxxxxxxxx>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> librbd/io/AioCompletion.cc:199:ssize_t
>>>>>>>>> AioCompletion::get_return_value() {
>>>>>>>>
>>>>>>>>
>>>>>>>> librbd just wraps librados, so I would think all the error codes
>>>>>>>> should have already been properly translated before it reaches this
>>>>>>>> level since otherwise any internal librbd error logging will output
>>>>>>>> the incorrect failure reason. I'd suspect most of the client-side
>>>>>>>> handling should probably be handled inside osdc/Objecter.h/cc..
>>>>>>>
>>>>>>> Hi Jason,
>>>>>>>
>>>>>>> Thanks for the pointer. Changing any of the librbd stuff did indeed not
>>>>>>> result in a working rados-striper.sh
>>>>>>>
>>>>>>> Objecter.{h,cc} already had the forward error rewrite. I added the
>>>>>>> reverse in the original patch. But obviously that is not enough (yet).
>>>>>>> So I'll start digging a bit more in the librados files as you suggested.
>>>>>>
>>>>>> I think the place to do this is in MOSDOpReply.. that alone should be
>>>>>> enough to do the translate as the value passes over the wire.
>>>>>
>>>>>
>>>>> Hi Sage,
>>>>>
>>>>> The interesting part of this is that ALL tests but one actually work. So
>>>>> all tests that start a cluster through vstart actually do work.
>>>>> EXCEPT for rados-striper.sh.
>>>>>
>>>>> Now this makes me question what is different with the striper code
>>>>> that causes an ECANCELED to not be translated back to the FreeBSD code.
>>>>
>>>> I'm not sure exactly how it's arranged, but libradosstriper is layered
>>>> on top of librados and I don't think anybody's done any of the errno
>>>> translation work for other platforms that you got pointed at.
>>>> Depending on how it's done that may mean it's missing big chunks --
>>>> for instance, if libradosstriper embeds error codes that aren't
>>>> touched by librados, it will need to do its own translation.
>>>
>>> Hi Greg,
>>>
>>> The error is on the path server -> client.
>>>
>>> How do I know: FreeBSD's highest error number atm is 96.
>>> ECANCELED is an expected return value in the striper code.
>>> So server-side translation seems to be doing what it should.
>>> Client-side code is:
>>>
>>> 1260 ./src/libradosstriper/RadosStriperImpl.cc
>>> ====
>>>   bl.append(oss.str());
>>>   writeOp.setxattr(XATTR_SIZE, bl);
>>>   rc = m_ioCtx.operate(firstObjOid, &writeOp);
>>>   // return current size
>>>   *size = curSize;
>>>   // handle case where objectsize is already bigger than size
>>>   if (-ECANCELED == rc)
>>>     rc = 0;
>>>   if (rc) {
>>>     unlockObject(soid, *lockCookie);
>>>     lderr(cct()) << "RadosStriperImpl::openStripedObjectForWrite : "
>>>                  << "could not set new size for "
>>>                  << soid << " : rc = " << rc << dendl;
>>>   }
>>>   return rc;
>>> ====
>>>
>>> So I have to drill down into m_ioCtx.operate.
>>> But I'll first look at Sage's suggestion.
>>
>> Have not been able to find the right spot....
>> So I upped the logging, and this is the first place where any reference to
>> -125 is made:
>> 116: 2017-06-20 01:24:21.556950 80fc18800 5 -- 127.0.0.1:0/1969737172 >> 127.0.0.1:6804/60048 conn(0x81065c000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2 cs=1 l=1). rx osd.1 seq 6 0x810696e00 osd_op_reply(5 toyfile.0000000000000000 [cmpxattr (8) op 3 mode 2,setxattr (4)] v19'4 uv3 ondisk = -125 ((125) Unknown error: 125)) v8
>> 116: 2017-06-20 01:24:21.556985 80fc18800 1 -- 127.0.0.1:0/1969737172 <== osd.1 127.0.0.1:6804/60048 6 ==== osd_op_reply(5 toyfile.0000000000000000 [cmpxattr (8) op 3 mode 2,setxattr (4)] v19'4 uv3 ondisk = -125 ((125) Unknown error: 125)) v8 ==== 210+0+0 (669224781 0 0) 0x810696e00 con 0x81065c000
>> 116: 2017-06-20 01:24:21.557009 80fc18800 10 client.4115.objecter ms_dispatch 0x80fc33000 osd_op_reply(5 toyfile.0000000000000000 [cmpxattr (8) op 3 mode 2,setxattr (4)] v19'4 uv3 ondisk = -125 ((125) Unknown error: 125)) v8
>> 116: 2017-06-20 01:24:21.557024 80fc18800 10 client.4115.objecter in handle_osd_op_reply
>> 116: 2017-06-20 01:24:21.557031 80fc18800 7 client.4115.objecter handle_osd_op_reply 5 ondisk uv 3 in 1.3 attempt 0
>> 116: 2017-06-20 01:24:21.557038 80fc18800 10 client.4115.objecter op 0 rval -85 len 0
>> 116: 2017-06-20 01:24:21.557043 80fc18800 10 client.4115.objecter op 1 rval 0 len 0
>> 116: 2017-06-20 01:24:21.557047 80fc18800 15 client.4115.objecter handle_osd_op_reply completed tid 5
>> 116: 2017-06-20 01:24:21.557050 80fc18800 15 client.4115.objecter finish_op 5
>> 116: 2017-06-20 01:24:21.557056 80fc18800 20 client.4115.objecter put_session s=0x810695800 osd=1 4
>> 116: 2017-06-20 01:24:21.557060 80fc18800 15 client.4115.objecter _session_op_remove 1 5
>> 116: 2017-06-20 01:24:21.557073 80fc18800 5 client.4115.objecter 0 in flight
>> 116: 2017-06-20 01:24:21.557085 80fc18800 20 client.4115.objecter put_session s=0x810695800 osd=1 3
>>
>> This makes me wonder: does this osd_op_reply contain the numeric error
>> value, or is it a formatted text error report of some event on the
>> server? In the latter case there is already a translation problem on
>> the server, and not in the client.
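
For reference, a minimal sketch of the idea behind translating at the wire
boundary, as discussed above. It assumes the wire carries Linux errno numbers
and the client runs on FreeBSD, where ECANCELED is 85 instead of 125. All names
here (wire_to_host_errno, host_to_wire_errno, wire_errno32) are hypothetical and
only illustrate the technique; this is not the actual Ceph code, and the table
covers a single errno for brevity.

```cpp
#include <cstdint>

// Hypothetical constants for illustration only.
// Assumption: wire format uses Linux numbering, host is FreeBSD.
constexpr int32_t WIRE_ECANCELED = 125;  // ECANCELED on Linux
constexpr int32_t HOST_ECANCELED = 85;   // ECANCELED on FreeBSD

// Map a wire error code (positive or negated) to host numbering.
// Codes that are not in the table pass through unchanged.
int32_t wire_to_host_errno(int32_t r) {
  if (r == -WIRE_ECANCELED) return -HOST_ECANCELED;
  if (r == WIRE_ECANCELED) return HOST_ECANCELED;
  return r;
}

// Reverse mapping, applied when a value is sent out.
int32_t host_to_wire_errno(int32_t r) {
  if (r == -HOST_ECANCELED) return -WIRE_ECANCELED;
  if (r == HOST_ECANCELED) return WIRE_ECANCELED;
  return r;
}

// Hypothetical wrapper type: the value is always kept in host numbering,
// and translation happens only where it crosses the wire.
struct wire_errno32 {
  int32_t code = 0;

  int32_t encode() const { return host_to_wire_errno(code); }
  void decode(int32_t from_wire) { code = wire_to_host_errno(from_wire); }
};
```

With this sketch, a field declared as wire_errno32 that decodes -125 ends up
holding -85, so a check like (-ECANCELED == rc) matches on FreeBSD. A field left
as a plain int32_t is copied as-is and stays -125. That would be consistent with
the log above showing "rval -85" for op 0 next to "ondisk = -125" for the reply
as a whole, but whether that is what actually happens in the real code path is
an assumption.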
>>
>> --WjW
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html