$ ceph osd tree
ID WEIGHT     TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 1047.59473 root default
-2  261.89868     host ngfdv036
 0   21.82489         osd.0           up  1.00000          1.00000
 4   21.82489         osd.4           up  1.00000          1.00000
 8   21.82489         osd.8           up  1.00000          1.00000
12   21.82489         osd.12          up  1.00000          1.00000
16   21.82489         osd.16          up  1.00000          1.00000
20   21.82489         osd.20          up  1.00000          1.00000
24   21.82489         osd.24          up  1.00000          1.00000
28   21.82489         osd.28          up  1.00000          1.00000
32   21.82489         osd.32          up  1.00000          1.00000
36   21.82489         osd.36          up  1.00000          1.00000
40   21.82489         osd.40          up  1.00000          1.00000
44   21.82489         osd.44          up  1.00000          1.00000
-3  261.89868     host ngfdv037
 1   21.82489         osd.1           up  1.00000          1.00000
 5   21.82489         osd.5           up  1.00000          1.00000
 9   21.82489         osd.9           up  1.00000          1.00000
13   21.82489         osd.13          up  1.00000          1.00000
17   21.82489         osd.17          up  1.00000          1.00000
21   21.82489         osd.21          up  1.00000          1.00000
25   21.82489         osd.25          up  1.00000          1.00000
29   21.82489         osd.29          up  1.00000          1.00000
33   21.82489         osd.33          up  1.00000          1.00000
37   21.82489         osd.37          up  1.00000          1.00000
41   21.82489         osd.41          up  1.00000          1.00000
45   21.82489         osd.45          up  1.00000          1.00000
-4  261.89868     host ngfdv038
 2   21.82489         osd.2           up  1.00000          1.00000
 6   21.82489         osd.6           up  1.00000          1.00000
10   21.82489         osd.10          up  1.00000          1.00000
14   21.82489         osd.14          up  1.00000          1.00000
18   21.82489         osd.18          up  1.00000          1.00000
22   21.82489         osd.22          up  1.00000          1.00000
26   21.82489         osd.26          up  1.00000          1.00000
30   21.82489         osd.30          up  1.00000          1.00000
34   21.82489         osd.34          up  1.00000          1.00000
38   21.82489         osd.38          up  1.00000          1.00000
42   21.82489         osd.42          up  1.00000          1.00000
46   21.82489         osd.46          up  1.00000          1.00000
-5  261.89868     host ngfdv039
 3   21.82489         osd.3           up  1.00000          1.00000
 7   21.82489         osd.7           up  1.00000          1.00000
11   21.82489         osd.11          up  1.00000          1.00000
15   21.82489         osd.15          up  1.00000          1.00000
19   21.82489         osd.19          up  1.00000          1.00000
23   21.82489         osd.23          up  1.00000          1.00000
27   21.82489         osd.27          up  1.00000          1.00000
31   21.82489         osd.31          up  1.00000          1.00000
35   21.82489         osd.35          up  1.00000          1.00000
39   21.82489         osd.39          up  1.00000          1.00000
43   21.82489         osd.43          up  1.00000          1.00000
47   21.82489         osd.47          up  1.00000          1.00000
$ ceph -s
    cluster 2b0e2d2b-3f63-4815-908a-b032c7f9427a
     health HEALTH_OK
     monmap e1: 2 mons at {ngfdv076=128.55.xxx.xx:6789/0,ngfdv078=128.55.xxx.xx:6789/0}
            election epoch 4, quorum 0,1 ngfdv076,ngfdv078
     osdmap e280: 48 osds: 48 up, 48 in
            flags sortbitwise,require_jewel_osds
      pgmap v117283: 3136 pgs, 11 pools, 25600 MB data, 510 objects
            79218 MB used, 1047 TB / 1047 TB avail
                3136 active+clean
Thank you Dan. I’ll try it.
Best,
Jialin
NERSC/LBNL
> On Jun 18, 2018, at 12:22 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> One way you can see exactly what is happening when you write an object
> is with --debug_ms=1.
>
> For example, I write a 100MB object to a test pool:
>   rados --debug_ms=1 -p test put 100M.dat 100M.dat
> I pasted the output of this here: https://pastebin.com/Zg8rjaTV
> In this case, it first gets the cluster maps from a mon, then writes
> the object to osd.58, which is the primary osd for PG 119.77:
>
> # ceph pg 119.77 query | jq .up
> [
> 58,
> 49,
> 31
> ]
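>
> If you are driving librados directly instead of the `rados` CLI, here is a
> minimal python-rados sketch of the same kind of write (the ceph.conf path
> and the pool name 'test' are assumptions for illustration); you can also
> enable the same messenger debugging from the client side with conf_set:
>
>   import rados
>
>   # Assumed conf path and pool name; adjust for your cluster.
>   cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>   cluster.conf_set('debug_ms', '1')      # same effect as --debug_ms=1
>   cluster.connect()
>
>   ioctx = cluster.open_ioctx('test')
>   with open('100M.dat', 'rb') as f:
>       ioctx.write_full('100M.dat', f.read())   # whole object, like `rados put`
>
>   ioctx.close()
>   cluster.shutdown()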
>
> Otherwise I answered your questions below...
>
>> On Sun, Jun 17, 2018 at 8:29 PM Jialin Liu <jalnliu@xxxxxxx> wrote:
>>
>> Hello,
>>
>> I have a couple of questions regarding I/O to the OSDs via librados.
>>
>>
>> 1. How can I check which OSD is receiving data?
>>
>
> See `ceph osd map`.
> For my example above:
>
> # ceph osd map test 100M.dat
> osdmap e236396 pool 'test' (119) object '100M.dat' -> pg 119.864b0b77
> (119.77) -> up ([58,49,31], p58) acting ([58,49,31], p58)
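>
> The same lookup can also be done from a librados client via a mon command;
> a small python-rados sketch (pool and object names are the ones from the
> example above, the conf path is an assumption):
>
>   import json
>   import rados
>
>   cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>   cluster.connect()
>
>   # Equivalent of `ceph osd map test 100M.dat`, asked of the monitors.
>   cmd = json.dumps({'prefix': 'osd map', 'pool': 'test',
>                     'object': '100M.dat', 'format': 'json'})
>   ret, outbuf, errs = cluster.mon_command(cmd, b'')
>   print(json.loads(outbuf))   # includes the PG plus the up/acting OSDs
>
>   cluster.shutdown()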
>
>> 2. Can the write operation return to the application as soon as the write to the primary OSD is done, or does it return only once the data has been replicated twice (size=3)?
>
> Write returns once it is safe on *all* replicas or EC chunks.
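>
> The asynchronous librados API makes that visible: in Jewel-era clients the
> 'complete' (acknowledged) and 'safe' (durable on all replicas) callbacks are
> separate events, and a blocking write returns at the 'safe' point. A rough
> python-rados sketch, with an assumed conf path and pool name:
>
>   import rados
>
>   cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>   cluster.connect()
>   ioctx = cluster.open_ioctx('test')        # assumed pool name
>
>   def on_complete(completion):
>       print('write acked')
>
>   def on_safe(completion):
>       print('write safe on *all* replicas / EC chunks')
>
>   comp = ioctx.aio_write_full('obj1', b'x' * (4 * 1024 * 1024),
>                               oncomplete=on_complete, onsafe=on_safe)
>   comp.wait_for_safe()    # what a synchronous write waits for
>
>   ioctx.close()
>   cluster.shutdown()
>
> (Newer releases collapse 'complete' and 'safe' into a single event.)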
>
>> 3. What is the I/O size at the lower level in librados? For example, if I send a 100MB request with one thread, does librados send the data in fixed-size transactions?
>
> This depends on the client. The `rados` CLI example I showed you broke
> the 100MB object into 4MB parts.
> Most use-cases keep the objects around 4MB or 8MB.
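>
> To do something similar from librados yourself, you could stream the file
> into one object as a series of ~4MB write ops at increasing offsets; a
> hedged sketch (conf path, pool and file names are assumptions):
>
>   import rados
>
>   CHUNK = 4 * 1024 * 1024                   # ~4MB per write op
>
>   cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>   cluster.connect()
>   ioctx = cluster.open_ioctx('test')        # assumed pool name
>
>   offset = 0
>   with open('100M.dat', 'rb') as f:
>       while True:
>           chunk = f.read(CHUNK)
>           if not chunk:
>               break
>           ioctx.write('100M.dat', chunk, offset)   # partial write at offset
>           offset += len(chunk)
>
>   ioctx.close()
>   cluster.shutdown()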
>
>> 4. I have 4 OSS (OSD servers) and 48 OSDs; will the 4 servers become the bottleneck? From the Ceph documentation, once the client receives the cluster map it can talk to the OSDs directly, so my assumption is that the maximum parallelism depends on the number of OSDs. Is this correct?
>>
>
> That's more or less correct -- the IOPS and bandwidth capacity of the cluster
> generally scales linearly with the number of OSDs.
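>
> To take advantage of that from a single client, keep several operations in
> flight so different PGs (and therefore different OSDs) are written in
> parallel; a rough python-rados sketch with illustrative names and sizes:
>
>   import rados
>
>   cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>   cluster.connect()
>   ioctx = cluster.open_ioctx('test')        # assumed pool name
>
>   # Objects hash to different PGs, so these writes fan out to many OSDs
>   # at once, directly from the client.
>   completions = []
>   for i in range(32):
>       data = b'x' * (4 * 1024 * 1024)
>       completions.append(ioctx.aio_write_full('parallel-obj-%d' % i, data))
>
>   for c in completions:
>       c.wait_for_complete()
>
>   ioctx.close()
>   cluster.shutdown()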
>
> Cheers,
> Dan
> CERN
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com