Hi John,
You are correct that my expectations may be incongruent with what is possible with ceph(fs). I'm currently copying many small files (images) from a NetApp to the cluster, about 35k each, and the number of objects/files copied thus far is fairly significant (see the objects column below):
[bababurko@cephmon01 ceph]$ sudo rados df
pool name         KB           objects    clones  degraded  unfound  rd       rd KB       wr         wr KB
cephfs_data       3289284749   163993660  0       0         0        0        0           328097038  3369847354
cephfs_metadata   133364       524363     0       0         0        3600023  5264453980  95600004   1361554516
rbd               0            0          0       0         0        0        0           0          0
total used        9297615196   164518023
total avail       19990923044
total space       29288538240
Yes, that looks like ~164 million objects copied to the cluster. I would assume this will potentially be a burden on the MDS, but I have yet to confirm that with 'ceph daemonperf mds.<id>'. I cannot seem to run it on the MDS host, as ceph doesn't seem to know about that command:
[bababurko@cephmds01]$ sudo ceph daemonperf mds.cephmds01
no valid command found; 10 closest matches:
osd lost <int[0-]> {--yes-i-really-mean-it}
osd create {<uuid>}
osd primary-temp <pgid> <id>
osd primary-affinity <osdname (id|osd.id)> <float[0.0-1.0]>
osd reweight <int[0-]> <float[0.0-1.0]>
osd pg-temp <pgid> {<id> [<id>...]}
osd in <ids> [<ids>...]
osd rm <ids> [<ids>...]
osd down <ids> [<ids>...]
osd out <ids> [<ids>...]
Error EINVAL: invalid command
This fails in a similar manner on all the hosts in the cluster. I'm very green w/ ceph and I'm probably missing something obvious. Is there something I need to install to get access to the 'ceph daemonperf' command in Hammer?
thanks,
Bob
On Wed, Aug 5, 2015 at 2:43 AM, John Spray <jspray@xxxxxxxxxx> wrote:
On Tue, Aug 4, 2015 at 10:36 PM, Bob Ababurko <bob@xxxxxxxxxxxx> wrote:
> My writes are not going as I would expect wrt IOPS (50-1000 IOPS) & write
> throughput (~25MB/s max). I'm interested in understanding what it takes to
> create an SSD pool that I can then migrate the current cephfs_metadata pool
> to. I suspect that the spinning disk metadata pool is a bottleneck and I
> want to try to get the max performance out of this cluster to prove that we
> would build out a larger version. One caveat is that I have copied about 4
> TB of data to the cluster via cephfs and don't want to lose the data so I
> obviously need to keep the metadata intact.
I'm a bit suspicious of this: your IOPS expectations sort of imply
doing big files, but you're then suggesting that metadata is the
bottleneck (i.e. small file workload).
There are lots of statistics that come out of the MDS; you may be
particularly interested in mds_server.handle_client_request,
objecter.op_active, to work out if there really are lots of RADOS
operations getting backed up on the MDS (which would be the symptom of
a too-slow metadata pool). "ceph daemonperf mds.<id>" may be some
help if you don't already have graphite or similar set up.
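If "ceph daemonperf" is not available, the same counters can be read from the daemon's admin socket with "ceph daemon mds.<id> perf dump", which prints JSON. Below is a minimal Python sketch of pulling out the two counters named above. The helper names (perf_dump, pick, rate) are made up for illustration, and it assumes the dump is laid out as section -> counter -> value and that the script runs on the MDS host with permission to reach the admin socket.

```python
import json
import subprocess

# Counters named in the message above, as (section, counter) pairs.
COUNTERS = [("mds_server", "handle_client_request"), ("objecter", "op_active")]


def perf_dump(daemon):
    """Run `ceph daemon <daemon> perf dump` locally and parse its JSON output."""
    out = subprocess.check_output(["ceph", "daemon", daemon, "perf", "dump"])
    return json.loads(out)


def pick(dump, counters=COUNTERS):
    """Return {"section.counter": value}; counters absent from the dump map to None."""
    return {
        "{}.{}".format(section, name): dump.get(section, {}).get(name)
        for section, name in counters
    }


def rate(before, after, seconds):
    """Per-second change between two samples of an ever-increasing counter."""
    return (after - before) / seconds
```

To use it, call pick(perf_dump("mds.cephmds01")) twice a few seconds apart. As far as I understand, handle_client_request only ever increases, so rate() on two samples gives requests per second, while op_active is an instantaneous count of in-flight RADOS operations and can be read directly. A persistently high op_active would be the "backed up" symptom described above.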
> If anyone has done this OR understands how this can be done, I would
> appreciate the advice.
You could potentially do this in a two-phase process where you
initially set a crush rule that includes both SSDs and spinners, and
then finally set a crush rule that just points to SSDs. Obviously
that'll do lots of data movement, but your metadata is probably a fair
bit smaller than your data so that might be acceptable.
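As a rough sketch of those two phases, the Python below only builds the "ceph osd pool set" commands and does not run anything. It assumes both CRUSH rules already exist in the CRUSH map, that the rule ids passed in (1 and 2 in the usage note) are placeholders for whatever "ceph osd crush rule dump" reports, and that the pool setting is called crush_ruleset, as it is in Hammer. The function name is made up for illustration.

```python
def migration_commands(pool, mixed_rule_id, ssd_rule_id):
    """Build the two-phase rule switch as argv lists; nothing is executed here."""

    def set_rule(rule_id):
        # In Hammer the pool setting is named crush_ruleset.
        return ["ceph", "osd", "pool", "set", pool, "crush_ruleset", str(rule_id)]

    phase1 = [set_rule(mixed_rule_id)]  # rule covering both SSDs and spinners
    phase2 = [set_rule(ssd_rule_id)]    # rule covering SSDs only
    return phase1, phase2


def render(commands):
    """Join each argv list into a shell-style line for review before running by hand."""
    return [" ".join(argv) for argv in commands]
```

For example, render(migration_commands("cephfs_metadata", 1, 2)[0]) gives the phase-one command as a string. Phase two would presumably be run only once the cluster has finished moving data and is healthy again after phase one.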
John
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com