Re: RGW - Can't download complete object


 



The code is in wip-11620, and it's currently on top of the next branch. We'll get it through the tests, then get it into hammer and firefly. I wouldn't recommend installing it in production without proper testing first.

Yehuda

----- Original Message -----
> From: "Sean Sullivan" <seapasulli@xxxxxxxxxxxx>
> To: "Yehuda Sadeh-Weinraub" <yehuda@xxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Sent: Wednesday, May 13, 2015 7:22:10 PM
> Subject: Re:  RGW - Can't download complete object
> 
> Thank you so much Yehuda! I look forward to testing these. Is there a way
> for me to pull this code in? Is it in master?
> 
> 
> On May 13, 2015 7:08:44 PM Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx> wrote:
> 
> > Ok, I dug a bit more, and it seems to me that the problem is with the
> > manifest that was created. I was able to reproduce a similar issue (opened
> > ceph bug #11622), for which I also have a fix.
> >
> > I created new tests to cover this issue, and we'll get those recent fixes
> > as soon as we can, after we test for any regressions.
> >
> > Thanks,
> > Yehuda
> >
> > ----- Original Message -----
> > > From: "Yehuda Sadeh-Weinraub" <yehuda@xxxxxxxxxx>
> > > To: "Sean Sullivan" <seapasulli@xxxxxxxxxxxx>
> > > Cc: ceph-users@xxxxxxxxxxxxxx
> > > Sent: Wednesday, May 13, 2015 2:33:07 PM
> > > Subject: Re:  RGW - Can't download complete object
> > >
> > > That's another interesting issue. Note that for part 12_80 the manifest
> > > specifies (I assume, by the messenger log) this part:
> > >
> > > default.20283.1__shadow_b235040a-46b6-42b3-b134-962b1f8813d5/28357709e44fff211de63b1d2c437159.bam.tJ8UddmcCxe0lOsgfHR9Q-ZHXdlrM14.12_80
> > > (note the 'tJ8UddmcCxe0lOsgfHR9Q-ZHXdlrM14')
> > >
> > > whereas it seems that you do have the original part:
> > > default.20283.1__shadow_b235040a-46b6-42b3-b134-962b1f8813d5/28357709e44fff211de63b1d2c437159.bam.2/-ztodNISNLlaNeV4kDmrQwmkECBP2mZ.12_80
> > > (note the '2/...')
> > >
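To make the comparison above concrete, here is a minimal sketch that splits the two object names quoted in this message and shows that they share the same trailing `12_80` but differ in the prefix before it. The helper name `split_shadow_name`, the argument names, and the assumed layout `<marker>__shadow_<key>.<prefix>.<suffix>` are illustrative assumptions read off the names in this thread, not a documented RGW naming specification.

```python
def split_shadow_name(oid, marker, key):
    """Split a shadow object name into (prefix, suffix).

    Assumed layout (inferred from the names in this thread only):
        <marker>__shadow_<key>.<prefix>.<suffix>
    `marker` and `key` are supplied by the caller, because the key itself
    may contain dots and slashes.
    """
    head = "%s__shadow_%s." % (marker, key)
    if not oid.startswith(head):
        raise ValueError("unexpected object name: %r" % oid)
    prefix, sep, suffix = oid[len(head):].rpartition(".")
    if not sep:
        raise ValueError("no prefix/suffix separator in: %r" % oid)
    return prefix, suffix


MARKER = "default.20283.1"  # assumed to be the part before "__shadow_"
KEY = "b235040a-46b6-42b3-b134-962b1f8813d5/28357709e44fff211de63b1d2c437159.bam"

# Name requested via the manifest (the read that failed with -2).
in_manifest = split_shadow_name(
    MARKER + "__shadow_" + KEY + ".tJ8UddmcCxe0lOsgfHR9Q-ZHXdlrM14.12_80",
    MARKER, KEY)
# Name that actually exists in the pool.
on_disk = split_shadow_name(
    MARKER + "__shadow_" + KEY + ".2/-ztodNISNLlaNeV4kDmrQwmkECBP2mZ.12_80",
    MARKER, KEY)

same_suffix = in_manifest[1] == on_disk[1]
same_prefix = in_manifest[0] == on_disk[0]
```

Running the same split over every name returned by a pool listing would be one way to spot other parts whose prefix differs from the rest of the upload.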
> > > The part that the manifest specifies does not exist, which makes me think
> > > that there is some weird upload sequence, something like:
> > >
> > >  - client uploads part, upload finishes but client does not get ack
> > >    for it
> > >  - client retries (second upload)
> > >  - client gets ack for the first upload and gives up on the second one
> > >
> > > But I'm not sure whether that would explain the manifest; I'll need to
> > > take a look at the code. Could such a sequence happen with the client
> > > that you're using to upload?
> > >
> > > Yehuda
> > >
> > > ----- Original Message -----
> > > > From: "Sean Sullivan" <seapasulli@xxxxxxxxxxxx>
> > > > To: "Yehuda Sadeh-Weinraub" <yehuda@xxxxxxxxxx>
> > > > Cc: ceph-users@xxxxxxxxxxxxxx
> > > > Sent: Wednesday, May 13, 2015 2:07:22 PM
> > > > Subject: Re:  RGW - Can't download complete object
> > > >
> > > > Sorry for the delay. It took me a while to figure out how to do a range
> > > > request and append the data to a single file. The good news is that the
> > > > end file seems to be 14G in size, which matches the file's manifest size.
> > > > The bad news is that the file is completely corrupt and the radosgw log
> > > > has errors. I am using the following code to perform the download:
> > > >
> > > > https://raw.githubusercontent.com/mumrah/s3-multipart/master/s3-mp-download.py
> > > >
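The "range request and append" approach mentioned above can be sketched roughly as follows. This is not the code from s3-mp-download.py; `byte_ranges`, `range_header`, `assemble`, and the `fetch` callable are hypothetical names, and `fetch(start, end)` stands in for whatever performs the real ranged GET. The sketch checks the length of every chunk, because a final file of the right size does not by itself show that every range was read successfully.

```python
def byte_ranges(total_size, chunk_size):
    """Yield inclusive (start, end) byte offsets covering total_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size) - 1


def range_header(start, end):
    """Format an HTTP Range header value for an inclusive byte span."""
    return "bytes=%d-%d" % (start, end)


def assemble(path, total_size, chunk_size, fetch):
    """Write each fetched chunk at its own offset and return bytes written.

    `fetch(start, end)` is a hypothetical callable that must return the
    bytes of that inclusive span (for example, the body of a ranged GET).
    """
    written = 0
    with open(path, "wb") as out:
        for start, end in byte_ranges(total_size, chunk_size):
            data = fetch(start, end)
            expected = end - start + 1
            if len(data) != expected:
                # Fail loudly instead of leaving a hole or garbage in the file.
                raise IOError("short read for %s: got %d of %d bytes"
                              % (range_header(start, end), len(data), expected))
            out.seek(start)
            out.write(data)
            written += len(data)
    return written
```

Seeking to each chunk's own offset, rather than appending in arrival order, keeps the output correct even if the chunks are fetched out of order.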
> > > > Here is a clip of the log file::
> > > > --
> > > > 2015-05-11 15:28:52.313742 7f570db7d700  1 -- 10.64.64.126:0/1033338 <== osd.11 10.64.64.101:6809/942707 5 ==== osd_op_reply(74566287 default.20283.1__shadow_b235040a-46b6-42b3-b134-962b1f8813d5/28357709e44fff211de63b1d2c437159.bam.2/-ztodNISNLlaNeV4kDmrQwmkECBP2mZ.13_12 [read 0~858004] v0'0 uv41308 ondisk = 0) v6 ==== 304+0+858004 (1180387808 0 2445559038) 0x7f53d005b1a0 con 0x7f56f8119240
> > > > 2015-05-11 15:28:52.313797 7f57067fc700 20 get_obj_aio_completion_cb: io completion ofs=12934184960 len=858004
> > > > 2015-05-11 15:28:52.372453 7f570db7d700  1 -- 10.64.64.126:0/1033338 <== osd.45 10.64.64.101:6845/944590 2 ==== osd_op_reply(74566142 default.20283.1__shadow_b235040a-46b6-42b3-b134-962b1f8813d5/28357709e44fff211de63b1d2c437159.bam.tJ8UddmcCxe0lOsgfHR9Q-ZHXdlrM14.12_80 [read 0~4194304] v0'0 uv0 ack = -2 ((2) No such file or directory)) v6 ==== 302+0+0 (3754425489 0 0) 0x7f53d005b1a0 con 0x7f56f81b1f30
> > > > 2015-05-11 15:28:52.372494 7f57067fc700 20 get_obj_aio_completion_cb: io completion ofs=12145655808 len=4194304
> > > >
> > > > 2015-05-11 15:28:52.372501 7f57067fc700  0 ERROR: got unexpected error when trying to read object: -2
> > > >
> > > > 2015-05-11 15:28:52.426079 7f570db7d700  1 -- 10.64.64.126:0/1033338 <== osd.21 10.64.64.102:6856/1133473 16 ==== osd_op_reply(74566144 default.20283.1__shadow_b235040a-46b6-42b3-b134-962b1f8813d5/28357709e44fff211de63b1d2c437159.bam.2/-ztodNISNLlaNeV4kDmrQwmkECBP2mZ.11_12 [read 0~3671316] v0'0 uv41395 ondisk = 0) v6 ==== 304+0+3671316 (1695485150 0 3933234139) 0x7f53d005b1a0 con 0x7f56f81e17d0
> > > > 2015-05-11 15:28:52.426123 7f57067fc700 20 get_obj_aio_completion_cb: io completion ofs=10786701312 len=3671316
> > > > 2015-05-11 15:28:52.504072 7f570db7d700  1 -- 10.64.64.126:0/1033338 <== osd.82 10.64.64.103:6857/88524 2 ==== osd_op_reply(74566283 default.20283.1__shadow_b235040a-46b6-42b3-b134-962b1f8813d5/28357709e44fff211de63b1d2c437159.bam.2/-ztodNISNLlaNeV4kDmrQwmkECBP2mZ.13_8 [read 0~4194304] v0'0 uv41566 ondisk = 0) v6 ==== 303+0+4194304 (1474509283 0 3209869954) 0x7f53d005b1a0 con 0x7f56f81b1420
> > > > 2015-05-11 15:28:52.504118 7f57067fc700 20 get_obj_aio_completion_cb: io completion ofs=12917407744 len=4194304
> > > >
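For longer logs, the failing reads can be picked out mechanically. The sketch below is only an illustration: the regular expression is written against the `osd_op_reply(...)` lines quoted in this message, it assumes each log entry sits on a single line (as in the gateway's own log file rather than a wrapped email), and `failed_reads` is a hypothetical helper name.

```python
import re

# Matches the osd_op_reply fields seen in the excerpt above:
# transaction id, object name, read offset~length, and the result code
# that follows "ack = " or "ondisk = ".
REPLY_RE = re.compile(
    r"osd_op_reply\((?P<tid>\d+)\s+(?P<oid>\S+)\s+"
    r"\[read (?P<off>\d+)~(?P<len>\d+)\].*?"
    r"(?:ack|ondisk) = (?P<rc>-?\d+)"
)


def failed_reads(lines):
    """Return (object_name, result_code) for every read reply with rc < 0."""
    failures = []
    for line in lines:
        match = REPLY_RE.search(line)
        if match and int(match.group("rc")) < 0:
            failures.append((match.group("oid"), int(match.group("rc"))))
    return failures
```

Passing an open log file to `failed_reads` would list each object that the gateway asked for but the OSD could not find (result code -2, i.e. ENOENT).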
> > > > I couldn't really find any good documentation on how fragments/files are
> > > > laid out on the object file system, so I am not sure where the file will
> > > > be. How could the 4 MB object have issues while the cluster health is
> > > > completely OK? I did do a rados stat of each object inside ceph and they
> > > > all appear to be there:
> > > >
> > > > http://paste.ubuntu.com/11118561/
> > > >
> > > > The sum of all of the objects :: 14584887282
> > > > The stat of the object inside ceph:: 14577056082
> > > >
> > > > So for some reason I have more data in objects than in the key manifest.
> > > > We easily identified this object via the same method as in the other
> > > > thread I have:
> > > >
> > > > for key in keys:
> > > >    ....:     if key.name == 'b235040a-46b6-42b3-b134-962b1f8813d5/28357709e44fff211de63b1d2c437159.bam':
> > > >    ....:         implicit = key.size  # size reported by the bucket listing
> > > >    ....:         explicit = conn.get_bucket(bucket).get_key(key.name).size  # size from a fresh lookup of the key
> > > >    ....:         absolute = abs(implicit - explicit)
> > > >    ....:         print key.name
> > > >    ....:         print implicit
> > > >    ....:         print explicit
> > > >    ....:
> > > >
> > > > b235040a-46b6-42b3-b134-962b1f8813d5/28357709e44fff211de63b1d2c437159.bam
> > > > 14578628946
> > > > 14577056082
> > > >
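The three sizes reported in this message can be lined up with a few lines of arithmetic. The numbers below are copied from the message; the dictionary labels and variable names are only illustrative. Taking the smallest value as the baseline, the bucket listing is larger by 1572864 bytes, which is exactly 1.5 MiB, and the sum of the rados objects is larger by 7831200 bytes.

```python
# Sizes quoted in this message, in bytes.
sizes = {
    "sum of rados objects": 14584887282,
    "bucket listing (key.size)": 14578628946,
    "get_key / stat of the object": 14577056082,
}

baseline = min(sizes.values())

# Excess of each figure over the smallest one, in bytes.
excess = {label: size - baseline for label, size in sizes.items()}

MIB = 1024 * 1024
for label in sorted(excess, key=excess.get):
    print("%-30s +%d bytes (%.2f MiB)" % (label, excess[label], excess[label] / float(MIB)))
```

Whether the 1.5 MiB gap corresponds to whole stripes or parts cannot be determined from these totals alone; the per-object sizes in the paste would be needed for that.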
> > > > So it looks like I have 3 different sizes. I figure this may be the
> > > > network issue that was mentioned in the other thread, but seeing as this
> > > > is not the first 512k and the overall size still matches, as well as the
> > > > errors I am seeing in the gateway, I feel that this may be a bigger issue.
> > > >
> > > > Has anyone seen this before? The only mention of the "got unexpected
> > > > error when trying to read object" is here
> > > > (http://lists.ceph.com/pipermail/ceph-commit-ceph.com/2014-May/021688.html)
> > > > but my google skills are pretty poor.
> > > > _______________________________________________
> > > > ceph-users mailing list
> > > > ceph-users@xxxxxxxxxxxxxx
> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > >
> > >
> 
> 
> 



