Thanks both Matt and Eric,
That is really interesting. I do tend to use "mc" since it can handle
multiple keys readily (e.g. when a user reports a problem). It was
noticeable that when getting the recursive listing of the "slow" bucket
using mc, the output did appear in a "chunked" manner, consistent with
the description of having to fetch the object list from all 32 shards,
sort/select and repeat.
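To make that fetch/merge/repeat pattern concrete, here is a minimal sketch in Python. It is only an illustration of the description above, not the actual RGW code: the crc32 hash, the page size of 1000, and the names make_shards and list_bucket are all assumptions made up for the example.

```python
import bisect
import heapq
import zlib


def make_shards(keys, num_shards=32):
    """Spread keys over index shards; each shard keeps its own keys sorted."""
    shards = [[] for _ in range(num_shards)]
    for key in keys:
        # crc32 is a stand-in hash for illustration, not what RGW uses.
        shards[zlib.crc32(key.encode()) % num_shards].append(key)
    return [sorted(shard) for shard in shards]


def list_bucket(shards, prefix="", delimiter="/", page_size=1000):
    """Ordered listing. Returns (results, entries_fetched, rounds)."""
    results, marker, fetched, rounds = [], None, 0, 0
    while True:
        rounds += 1
        candidates = []
        for shard in shards:
            # Every shard is asked for up to page_size keys after the marker.
            if marker is None:
                start = bisect.bisect_left(shard, prefix)
            else:
                start = bisect.bisect_right(shard, marker)
            chunk = [k for k in shard[start:start + page_size]
                     if k.startswith(prefix)]
            fetched += len(chunk)
            candidates.append(chunk)
        # Merge the sorted chunks and keep only the first page_size keys.
        page = list(heapq.merge(*candidates))[:page_size]
        if not page:
            break
        for key in page:
            rest = key[len(prefix):]
            cut = rest.find(delimiter) if delimiter else -1
            # Collapse everything below the first delimiter into one entry.
            name = prefix + rest[:cut + 1] if cut >= 0 else key
            if not results or results[-1] != name:
                results.append(name)
        marker = page[-1]
    return results, fetched, rounds


if __name__ == "__main__":
    keys = [f"impute/obj{i:05d}" for i in range(4281)]
    keys += [f"wgs/obj{i:05d}" for i in range(20443)]
    results, fetched, rounds = list_bucket(make_shards(keys))
    print(results, fetched, rounds)
```

In this toy model a non-recursive listing of the bucket root returns only the two "directory" entries, yet it still has to walk every key in every shard across many rounds, and each round fetches far more entries than it keeps.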
I do remember seeing something about the new ability to list buckets
unsorted - we are running luminous; probably would have been 12.2.7 at
the time of writing. I assume it must need client-side support... not
yet there in current mc, or s3cmd 2.01.
However I'm pretty sure that the original user in question was using
s3cmd. Sorry if that seems like misdirection; it was just easier for me
to reproduce with mc. The strange disparity between listing the bucket
"root", and subdirectories, also takes place for non-recursive listings.
I can reproduce the slow response using s3cmd (using just a few samples
for timing data):
[~] % time ./s3cmd-fried ls s3://friedlab/
DIR s3://friedlab/impute/
DIR s3://friedlab/wgs/
real varies from 2m44.137s - 7m53.282s
[~] % time ./s3cmd-fried ls s3://friedlab/impute/
DIR s3://friedlab/impute/illumina-affy/
DIR s3://friedlab/impute/illumina-wgs/
real varies from 0m0.558s - 0m3.884s
[~] % time ./s3cmd-fried ls s3://friedlab/wgs/
DIR s3://friedlab/wgs/amst/
(snip, but 42 dirs total)
DIR s3://friedlab/wgs/yrkt/
real varies from 0m2.506s - 0m21.250s
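For anyone wanting more than a few samples, a rough sketch of a timing loop in Python follows. The helper name time_command and the default of 5 samples are assumptions for illustration; it simply runs whatever command line it is given and records wall-clock time.

```python
import subprocess
import sys
import time


def time_command(argv, samples=5):
    """Run argv `samples` times; return wall-clock durations in seconds."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__" and len(sys.argv) > 1:
    times = time_command(sys.argv[1:])
    print(f"real varies from {min(times):.3f}s - {max(times):.3f}s")
```

Saved under a hypothetical name such as time_listing.py, it could be invoked as "python3 time_listing.py ./s3cmd-fried ls s3://friedlab/".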
Also, the two subtrees inside the bucket seem to contain 4281 and
20443 objects respectively, which doesn't seem like a pathological number.
I'll need to look at "mc" more carefully - I'm getting more inconsistent
results than I remember from it right now.
Thanks,
Graham
On 11/05/2018 03:59 PM, Matt Benjamin wrote:
Hi,
I just did some testing to confirm, and can report that "mc ls -r"
does definitely appear to be inducing latency related to Unix path
emulation.
Matt
On Mon, Nov 5, 2018 at 3:10 PM, J. Eric Ivancich <ivancich@xxxxxxxxxx> wrote:
I did make an inquiry and someone here does have some experience w/ the
mc command -- minio client. We're curious how "ls -r" is implemented
under mc. Does it need to get a full listing and then do some path
parsing to produce nice output? If so, it may be playing a role in the
delay as well.
Eric
On 9/26/18 5:27 PM, Graham Allan wrote:
I have one user bucket, where inexplicably (to me), the bucket takes an
eternity to list, though only on the top level. There are two
subfolders, each of which lists individually at a completely normal
speed...
eg (using minio client):
[~] % time ./mc ls fried/friedlab/
[2018-09-26 16:15:48 CDT] 0B impute/
[2018-09-26 16:15:48 CDT] 0B wgs/
real 1m59.390s
[~] % time ./mc ls -r fried/friedlab/
...
real 3m18.013s
[~] % time ./mc ls -r fried/friedlab/impute
...
real 0m13.512s
[~] % time ./mc ls -r fried/friedlab/wgs
...
real 0m6.437s
The bucket has about 55k objects total, with 32 index shards on a
replicated ssd pool. It shouldn't be taking this long but I can't
imagine what could be causing this. I haven't found any others behaving
this way. I'd think it has to be some problem with the bucket index, but
what...?
I did naively try some "radosgw-admin bucket check [--fix]" commands
with no change.
Graham
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Graham Allan
Minnesota Supercomputing Institute - gta@xxxxxxx