Re: Corruption by missing blocks

On Tue, Apr 23, 2013 at 11:38 AM, Bryan Stillwell
<bstillwell@xxxxxxxxxxxxxxx> wrote:
> I've run into an issue where after copying a file to my cephfs cluster
> the md5sums no longer match.  I believe I've tracked it down to some
> parts of the file which are missing:
>
> $ obj_name=$(cephfs "title1.mkv" show_location -l 0 | grep object_name
> | sed -e "s/.*:\W*\([0-9a-f]*\)\.[0-9a-f]*/\1/")
> $ echo "Object name: $obj_name"
> Object name: 10000001120
>
> $ file_size=$(stat "title1.mkv" | grep Size | awk '{ print $2 }')
> $ printf "File size: %d MiB (%d Bytes)\n" $(($file_size/1048576)) $file_size
> File size: 20074 MiB (21049178117 Bytes)
>
> $ blocks=$((file_size/4194304+1))
> $ printf "Blocks: %d\n" $blocks
> Blocks: 5019
>
> $ for b in `seq 0 $(($blocks-1))`; do rados -p data stat
> ${obj_name}.`printf '%8.8x\n' $b` | grep "error"; done
>  error stat-ing data/10000001120.00001076: No such file or directory
>  error stat-ing data/10000001120.000011c7: No such file or directory
>  error stat-ing data/10000001120.0000129c: No such file or directory
>  error stat-ing data/10000001120.000012f4: No such file or directory
>  error stat-ing data/10000001120.00001307: No such file or directory
>
>
> Any ideas where to look to investigate what caused these blocks to not
> be written?

What client are you using to write this? Is it fairly reproducible (so
you could collect logs of it happening)?
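If it's ceph-fuse, one way to capture client-side logs of a reproduction
would be to remount with verbose client debugging and re-copy the file.
This is just a sketch: the mount point and log path below are placeholders,
the debug levels are only suggestions, and the kernel client would need its
own (different) debugging knobs instead.

$ fusermount -u /mnt/cephfs
$ ceph-fuse -m 172.24.88.50:6789 /mnt/cephfs \
      --debug-client=20 --debug-ms=1 \
      --log-file=/var/log/ceph/ceph-fuse.log
# re-copy title1.mkv, then grab the log once the md5sums diverge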

Usually the only times I've seen anything like this have been when the
file data was supposed to go into a pool the client didn't have write
permission on, or when the RADOS cluster was in bad shape and the data
never got flushed to disk. Has your cluster been healthy since you
started writing the file out?
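
For what it's worth, here's a quick way to check both of those; just a
sketch, assuming the mount is using the default client.admin key
(substitute whatever client name/key your mount actually uses):

$ cephfs title1.mkv show_layout   # which data pool does the file's layout point at?
$ ceph auth list                  # does that client key have rw caps on that pool?
$ ceph health detail              # any PGs that weren't active+clean during the copy?
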
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


>
> Here's the current state of the cluster:
>
> ceph -s
>    health HEALTH_OK
>    monmap e1: 1 mons at {a=172.24.88.50:6789/0}, election epoch 1, quorum 0 a
>    osdmap e22059: 24 osds: 24 up, 24 in
>     pgmap v1783615: 1920 pgs: 1917 active+clean, 3
> active+clean+scrubbing+deep; 4667 GB data, 9381 GB used, 4210 GB /
> 13592 GB avail
>    mdsmap e437: 1/1/1 up {0=a=up:active}
>
> Here's my current crushmap:
>
> # begin crush map
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 osd.4
> device 5 osd.5
> device 6 osd.6
> device 7 osd.7
> device 8 osd.8
> device 9 osd.9
> device 10 osd.10
> device 11 osd.11
> device 12 osd.12
> device 13 osd.13
> device 14 osd.14
> device 15 osd.15
> device 16 osd.16
> device 17 osd.17
> device 18 osd.18
> device 19 osd.19
> device 20 osd.20
> device 21 osd.21
> device 22 osd.22
> device 23 osd.23
>
> # types
> type 0 osd
> type 1 host
> type 2 rack
> type 3 row
> type 4 room
> type 5 datacenter
> type 6 pool
>
> # buckets
> host b1 {
>         id -2           # do not change unnecessarily
>         # weight 2.980
>         alg straw
>         hash 0  # rjenkins1
>         item osd.0 weight 0.500
>         item osd.1 weight 0.500
>         item osd.2 weight 0.500
>         item osd.3 weight 0.500
>         item osd.4 weight 0.500
>         item osd.20 weight 0.480
> }
> host b2 {
>         id -4           # do not change unnecessarily
>         # weight 4.680
>         alg straw
>         hash 0  # rjenkins1
>         item osd.5 weight 0.500
>         item osd.6 weight 0.500
>         item osd.7 weight 2.200
>         item osd.8 weight 0.500
>         item osd.9 weight 0.500
>         item osd.21 weight 0.480
> }
> host b3 {
>         id -5           # do not change unnecessarily
>         # weight 3.480
>         alg straw
>         hash 0  # rjenkins1
>         item osd.10 weight 0.500
>         item osd.11 weight 0.500
>         item osd.12 weight 1.000
>         item osd.13 weight 0.500
>         item osd.14 weight 0.500
>         item osd.22 weight 0.480
> }
> host b4 {
>         id -6           # do not change unnecessarily
>         # weight 3.480
>         alg straw
>         hash 0  # rjenkins1
>         item osd.15 weight 0.500
>         item osd.16 weight 1.000
>         item osd.17 weight 0.500
>         item osd.18 weight 0.500
>         item osd.19 weight 0.500
>         item osd.23 weight 0.480
> }
> pool default {
>         id -1           # do not change unnecessarily
>         # weight 14.620
>         alg straw
>         hash 0  # rjenkins1
>         item b1 weight 2.980
>         item b2 weight 4.680
>         item b3 weight 3.480
>         item b4 weight 3.480
> }
>
> # rules
> rule data {
>         ruleset 0
>         type replicated
>         min_size 2
>         max_size 10
>         step take default
>         step chooseleaf firstn 0 type host
>         step emit
> }
> rule metadata {
>         ruleset 1
>         type replicated
>         min_size 2
>         max_size 10
>         step take default
>         step chooseleaf firstn 0 type host
>         step emit
> }
> rule rbd {
>         ruleset 2
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step chooseleaf firstn 0 type host
>         step emit
> }
>
> # end crush map
>
>
> Thanks,
> Bryan
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



