Sorry, I meant kernel client or ceph-fuse? Client logs would be enough
to start with, I suppose: "debug client = 20" and "debug ms = 1" if
using ceph-fuse (a sketch of setting these is appended at the end of
this message); if using the kernel client things get trickier; I'd have
to look at what logging is available without the debugfs stuff being
enabled. :/
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Tue, Apr 23, 2013 at 3:00 PM, Bryan Stillwell
<bstillwell@xxxxxxxxxxxxxxx> wrote:
> I've tried a few different ones:
>
> 1. cp to a cephfs-mounted filesystem on Ubuntu 12.10 (quantal)
> 2. rsync over ssh to a cephfs-mounted filesystem on Ubuntu 12.04.2 (precise)
> 3. scp to a cephfs-mounted filesystem on Ubuntu 12.04.2 (precise)
>
> It's fairly reproducible, so I can collect logs for you. Which ones
> would you be interested in?
>
> The cluster has been in a couple of states during testing (during
> expansion/rebalancing and during an all active+clean state).
>
> BTW, all the nodes are running the 0.56.4-1precise packages.
>
> Bryan
>
> On Tue, Apr 23, 2013 at 12:56 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> On Tue, Apr 23, 2013 at 11:38 AM, Bryan Stillwell
>> <bstillwell@xxxxxxxxxxxxxxx> wrote:
>>> I've run into an issue where, after copying a file to my cephfs cluster,
>>> the md5sums no longer match. I believe I've tracked it down to some
>>> parts of the file which are missing:
>>>
>>> $ obj_name=$(cephfs "title1.mkv" show_location -l 0 | grep object_name \
>>>     | sed -e "s/.*:\W*\([0-9a-f]*\)\.[0-9a-f]*/\1/")
>>> $ echo "Object name: $obj_name"
>>> Object name: 10000001120
>>>
>>> $ file_size=$(stat "title1.mkv" | grep Size | awk '{ print $2 }')
>>> $ printf "File size: %d MiB (%d Bytes)\n" $(($file_size/1048576)) $file_size
>>> File size: 20074 MiB (21049178117 Bytes)
>>>
>>> $ blocks=$((file_size/4194304+1))
>>> $ printf "Blocks: %d\n" $blocks
>>> Blocks: 5019
>>>
>>> $ for b in `seq 0 $(($blocks-1))`; do rados -p data stat \
>>>     ${obj_name}.`printf '%8.8x\n' $b` | grep "error"; done
>>> error stat-ing data/10000001120.00001076: No such file or directory
>>> error stat-ing data/10000001120.000011c7: No such file or directory
>>> error stat-ing data/10000001120.0000129c: No such file or directory
>>> error stat-ing data/10000001120.000012f4: No such file or directory
>>> error stat-ing data/10000001120.00001307: No such file or directory
>>>
>>> Any ideas where to look to investigate what caused these blocks to not
>>> be written?
>>
>> What client are you using to write this? Is it fairly reproducible (so
>> you could collect logs of it happening)?
>>
>> Usually the only times I've seen anything like this were when either
>> the file data was supposed to go into a pool which the client didn't
>> have write permissions on, or when the RADOS cluster was in bad shape
>> and so the data never got flushed to disk. Has your cluster been
>> healthy since you started writing the file out?
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>> Here's the current state of the cluster:
>>>
>>> ceph -s
>>>    health HEALTH_OK
>>>    monmap e1: 1 mons at {a=172.24.88.50:6789/0}, election epoch 1, quorum 0 a
>>>    osdmap e22059: 24 osds: 24 up, 24 in
>>>    pgmap v1783615: 1920 pgs: 1917 active+clean, 3 active+clean+scrubbing+deep;
>>>          4667 GB data, 9381 GB used, 4210 GB / 13592 GB avail
>>>    mdsmap e437: 1/1/1 up {0=a=up:active}
>>>
>>> Here's my current crushmap:
>>>
>>> # begin crush map
>>>
>>> # devices
>>> device 0 osd.0
>>> device 1 osd.1
>>> device 2 osd.2
>>> device 3 osd.3
>>> device 4 osd.4
>>> device 5 osd.5
>>> device 6 osd.6
>>> device 7 osd.7
>>> device 8 osd.8
>>> device 9 osd.9
>>> device 10 osd.10
>>> device 11 osd.11
>>> device 12 osd.12
>>> device 13 osd.13
>>> device 14 osd.14
>>> device 15 osd.15
>>> device 16 osd.16
>>> device 17 osd.17
>>> device 18 osd.18
>>> device 19 osd.19
>>> device 20 osd.20
>>> device 21 osd.21
>>> device 22 osd.22
>>> device 23 osd.23
>>>
>>> # types
>>> type 0 osd
>>> type 1 host
>>> type 2 rack
>>> type 3 row
>>> type 4 room
>>> type 5 datacenter
>>> type 6 pool
>>>
>>> # buckets
>>> host b1 {
>>>     id -2           # do not change unnecessarily
>>>     # weight 2.980
>>>     alg straw
>>>     hash 0          # rjenkins1
>>>     item osd.0 weight 0.500
>>>     item osd.1 weight 0.500
>>>     item osd.2 weight 0.500
>>>     item osd.3 weight 0.500
>>>     item osd.4 weight 0.500
>>>     item osd.20 weight 0.480
>>> }
>>> host b2 {
>>>     id -4           # do not change unnecessarily
>>>     # weight 4.680
>>>     alg straw
>>>     hash 0          # rjenkins1
>>>     item osd.5 weight 0.500
>>>     item osd.6 weight 0.500
>>>     item osd.7 weight 2.200
>>>     item osd.8 weight 0.500
>>>     item osd.9 weight 0.500
>>>     item osd.21 weight 0.480
>>> }
>>> host b3 {
>>>     id -5           # do not change unnecessarily
>>>     # weight 3.480
>>>     alg straw
>>>     hash 0          # rjenkins1
>>>     item osd.10 weight 0.500
>>>     item osd.11 weight 0.500
>>>     item osd.12 weight 1.000
>>>     item osd.13 weight 0.500
>>>     item osd.14 weight 0.500
>>>     item osd.22 weight 0.480
>>> }
>>> host b4 {
>>>     id -6           # do not change unnecessarily
>>>     # weight 3.480
>>>     alg straw
>>>     hash 0          # rjenkins1
>>>     item osd.15 weight 0.500
>>>     item osd.16 weight 1.000
>>>     item osd.17 weight 0.500
>>>     item osd.18 weight 0.500
>>>     item osd.19 weight 0.500
>>>     item osd.23 weight 0.480
>>> }
>>> pool default {
>>>     id -1           # do not change unnecessarily
>>>     # weight 14.620
>>>     alg straw
>>>     hash 0          # rjenkins1
>>>     item b1 weight 2.980
>>>     item b2 weight 4.680
>>>     item b3 weight 3.480
>>>     item b4 weight 3.480
>>> }
>>>
>>> # rules
>>> rule data {
>>>     ruleset 0
>>>     type replicated
>>>     min_size 2
>>>     max_size 10
>>>     step take default
>>>     step chooseleaf firstn 0 type host
>>>     step emit
>>> }
>>> rule metadata {
>>>     ruleset 1
>>>     type replicated
>>>     min_size 2
>>>     max_size 10
>>>     step take default
>>>     step chooseleaf firstn 0 type host
>>>     step emit
>>> }
>>> rule rbd {
>>>     ruleset 2
>>>     type replicated
>>>     min_size 1
>>>     max_size 10
>>>     step take default
>>>     step chooseleaf firstn 0 type host
>>>     step emit
>>> }
>>>
>>> # end crush map
>>>
>>> Thanks,
>>> Bryan
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
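
Appendix: a rough sketch of the logging Greg suggests above. Only the
two debug values come from his note; the config file path, log file
location, mount point, and command-line spellings are assumptions.

    # /etc/ceph/ceph.conf on the node running ceph-fuse (assumed path)
    [client]
        debug client = 20
        debug ms = 1
        # assumed log location; $name/$pid are ceph.conf metavariables
        log file = /var/log/ceph/client.$name.$pid.log

    # or, probably equivalently, passed as options when mounting
    # (/mnt/cephfs is a placeholder; the monitor address is from ceph -s):
    $ ceph-fuse -m 172.24.88.50:6789 /mnt/cephfs --debug-client 20 --debug-ms 1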
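
For anyone following along, the 4194304 in the script above is the
default 4 MiB CephFS object size (assuming a default file layout), so
the object count works out as reported:

    $ echo $((21049178117 / 4194304 + 1))   # file size / 4 MiB, rounded up
    5019

i.e. objects 10000001120.00000000 through 10000001120.0000139a, which
matches the loop's "Blocks: 5019".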
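
On the write-permission theory, a couple of commands that might help
narrow things down (the object name is just one of the missing ones
from the errors above; output details vary by release):

    # inspect the caps on the keys the clients mount with
    $ ceph auth list

    # show which PG and OSDs one of the missing objects maps to
    $ ceph osd map data 10000001120.00001076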