Re: radosgw only delivers whats cached if latency between keyrequest and actual download is above 90s

I just tried this with some smaller objects (around 4.5 MB) as well as
with a 16 GB file, and it worked fine.

However, I am using the apache + fastcgi interface to rgw rather than civetweb.

-Ben

On Fri, Aug 21, 2015 at 12:19 PM, Sean <seapasulli@xxxxxxxxxxxx> wrote:
> We heavily use radosgw here for most of our work and we have seen a weird
> truncation issue with radosgw/s3 requests.
>
> We have noticed that if the time between the initial "ticket" to grab the
> object key and grabbing the data is greater than 90 seconds, the object
> returned is truncated to whatever RGW has grabbed/cached after the initial
> connection, which seems to be around 512k.
>
> Here is a PoC. It reproduces the problem on most objects I have tested,
> mostly 1G to 5G keys in RGW::
>
> ------------------------------------------------------------------------------------------------------------------------
> ------------------------------------------------------------------------------------------------------------------------
> #!/usr/bin/env python
>
> import os
> import sys
> import json
> import time
>
> import boto
> import boto.s3.connection
>
> if __name__ == '__main__':
>     import argparse
>
>     parser = argparse.ArgumentParser(description='Delayed download.')
>
>     parser.add_argument('credentials', type=argparse.FileType('r'),
>         help='Credentials file.')
>
>     parser.add_argument('endpoint')
>     parser.add_argument('bucket')
>     parser.add_argument('key')
>
>     args = parser.parse_args()
>
>     credentials = json.load(args.credentials)[args.endpoint]
>
>     conn = boto.connect_s3(
>         aws_access_key_id     = credentials.get('access_key'),
>         aws_secret_access_key = credentials.get('secret_key'),
>         host                  = credentials.get('host'),
>         port                  = credentials.get('port'),
>         is_secure             = credentials.get('is_secure',False),
>         calling_format        = boto.s3.connection.OrdinaryCallingFormat(),
>     )
>
>     key = conn.get_bucket(args.bucket).get_key(args.key)
>
>     key.BufferSize = 1048576
>     key.open_read(headers={})
>     time.sleep(120)
>
>     key.get_contents_to_file(sys.stdout)
> ------------------------------------------------------------------------------------------------------------------------
> ------------------------------------------------------------------------------------------------------------------------
>
> The format of the credentials file is just standard::
>
> =============================================
> =============================================
> {
>  "cluster": {
>         "access_key": "blahblahblah",
>         "secret_key": "blahblahblah",
>         "host": "blahblahblah",
>         "port": "443",
>         "is_secure": true
>         }
> }
>
> =============================================
> =============================================
>
>
> From here your object will almost always be truncated to whatever the
> gateway has cached in the time after the initial key request.
>
> This can be a huge issue: if the radosgw or the cluster is under load, some
> requests can take minutes. You can fetch the rest of the object with a
> range request against the gateway, so I know the data is intact. But I
> don't think radosgw should act as if the download completed successfully;
> it should instead return an error of some kind if it can no longer service
> the request.
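
The range-request workaround described above could look roughly like the
following. This is only a minimal sketch, not something from the original
report: download_with_resume and fetch_range are hypothetical names, and
fetch_range(start, end) stands in for whatever issues a ranged GET. With boto
it might wrap key.get_contents_as_string(headers={'Range': 'bytes=%d-%d' %
(start, end)}), and expected_size would come from key.size. The client compares
what it received against the expected size and re-requests only the missing
tail, raising an error instead of silently accepting a short object.

```python
def download_with_resume(fetch_range, expected_size, max_attempts=5):
    """Collect expected_size bytes, re-requesting any missing tail by range.

    fetch_range(start, end) is a hypothetical callable returning the bytes of
    the inclusive range [start, end]; it may return fewer bytes than requested
    (for example when the gateway truncates the response).
    """
    chunks = []
    received = 0
    for _ in range(max_attempts):
        if received >= expected_size:
            break
        # Ask only for the part we do not have yet.
        data = fetch_range(received, expected_size - 1)
        # Guard against a server that returns more than was asked for.
        data = data[:expected_size - received]
        chunks.append(data)
        received += len(data)
    if received != expected_size:
        # Surface the truncation instead of treating it as success.
        raise IOError("truncated: got %d of %d bytes"
                      % (received, expected_size))
    return b"".join(chunks)
```

The size check at the end is the important part: a short read becomes an
explicit error on the client side even if the gateway reports success.
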
>
> We are using hammer (ceph version 0.94.2
> (5fb85614ca8f354284c713a2f9c610860720bbf3)) and using civetweb as our
> gateway.
>
> This is on a 3 node test cluster but I have tried on our larger cluster with
> the same behavior. If I can provide any other information please let me
> know.
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com