Re: Ceph and its failures

Did another test with the same scenario, but I was wondering about deep-scrub.

So after corrupting the Ceph PG I tried strace on the OSD.
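
For reference, this is roughly the kind of corruption I mean (illustration only, not necessarily my exact steps; the path is the one that shows up in the strace output further down). Replacing the object file behind Ceph's back changes its data_digest and also drops the user.ceph.* xattrs, which would match the "missing attr" messages below:

# OBJ=/var/lib/ceph/osd/nmz-0/current/0.1e_head/rb.0.107a.2ae8944a.000000000000__head_227E76DE__0
# dd if=/dev/urandom of=$OBJ.tmp bs=4M count=1
# mv $OBJ.tmp $OBJ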

First strace, while deep-scrub detects no errors:

# ceph pg deep-scrub 0.1e

7f0492f94700  0 log_channel(cluster) log [INF] : 0.1e deep-scrub starts
7f049078f700  0 log_channel(cluster) log [INF] : 0.1e deep-scrub ok
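
To know which ceph-osd to attach strace to, the primary for the PG and its PID can be looked up with something like this (the first OSD in the acting set is the primary):

# ceph pg map 0.1e
# pidof ceph-osd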

# strace -e open,access -p 18756 -ff
Process 18756 attached with 85 threads
[pid 18812] open("/proc/loadavg", O_RDONLY) = 54
[pid 18812] open("/proc/loadavg", O_RDONLY) = 54
[pid 18812] open("/proc/loadavg", O_RDONLY) = 54
[pid 18812] open("/proc/loadavg", O_RDONLY) = 54
[pid 18768] open("/proc/loadavg", O_RDONLY) = 54
[pid 18768] open("/proc/loadavg", O_RDONLY) = 54
[pid 18768] open("/proc/loadavg", O_RDONLY) = 54
[pid 18812] open("/proc/loadavg", O_RDONLY) = 54
[pid 18768] open("/proc/loadavg", O_RDONLY) = 54
[pid 18812] open("/proc/loadavg", O_RDONLY) = 54
[pid 18812] open("/proc/loadavg", O_RDONLY) = 54
[pid 18768] open("/proc/loadavg", O_RDONLY) = 54
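
The on-disk state can also be checked by hand, independently of scrub, by comparing the replicas directly. As far as I know FileStore keeps the object info and snapset as user.ceph._ and user.ceph.snapset xattrs, so something like this on each OSD holding the PG should show the difference:

# md5sum /var/lib/ceph/osd/nmz-0/current/0.1e_head/rb.0.107a.2ae8944a.000000000000__head_227E76DE__0
# getfattr -d /var/lib/ceph/osd/nmz-0/current/0.1e_head/rb.0.107a.2ae8944a.000000000000__head_227E76DE__0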


After restarting the OSD:

# ceph pg deep-scrub 0.1e

7f0492f94700  0 log_channel(cluster) log [INF] : 0.1e deep-scrub starts
7f0492f94700 -1 log_channel(cluster) log [ERR] : 0.1e shard 0: soid 0/227e76de/rb.0.107a.2ae8944a.000000000000/head data_digest 0x80c0517f != best guess data_digest 0x7b027e29 from auth shard 1, missing attr _, missing attr snapset
7f0492f94700 -1 log_channel(cluster) log [ERR] : 0.1e deep-scrub 0 missing, 1 inconsistent objects
7f0492f94700 -1 log_channel(cluster) log [ERR] : 0.1e deep-scrub 1 errors
7f0492f94700  0 log_channel(cluster) log [INF] : 0.1e repair starts
7f0492f94700 -1 log_channel(cluster) log [ERR] : 0.1e shard 0: soid 0/227e76de/rb.0.107a.2ae8944a.000000000000/head data_digest 0x80c0517f != best guess data_digest 0x7b027e29 from auth shard 1, missing attr _, missing attr snapset
7f0492f94700 -1 log_channel(cluster) log [ERR] : 0.1e repair 0 missing, 1 inconsistent objects
7f0492f94700 -1 log_channel(cluster) log [ERR] : 0.1e repair 1 errors, 1 fixed
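
On Jewel and later (I think) the details of the inconsistency can also be dumped without digging through the OSD log:

# ceph health detail
# rados list-inconsistent-obj 0.1e --format=json-pretty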


# strace -e open,access -p 23471 -ff
Process 23471 attached with 81 threads
[pid 23527] open("/proc/loadavg", O_RDONLY) = 25
[pid 23527] open("/proc/loadavg", O_RDONLY) = 25
[pid 23483] open("/proc/loadavg", O_RDONLY) = 25
[pid 23514] open("/var/lib/ceph/osd/nmz-0/current/0.1e_head/rb.0.107a.2ae8944a.000000000007__head_85D5649E__0", O_RDWR) = 25
[pid 23514] open("/var/lib/ceph/osd/nmz-0/current/0.1e_head/rb.0.107a.2ae8944a.000000000041__head_821CF99E__0", O_RDWR) = 92
[pid 23514] open("/var/lib/ceph/osd/nmz-0/current/0.1e_head/rb.0.107a.2ae8944a.00000000000b__head_C5B1415E__0", O_RDWR) = 93
[pid 23514] open("/var/lib/ceph/osd/nmz-0/current/0.1e_head/rb.0.107a.2ae8944a.000000000000__head_227E76DE__0", O_RDWR) = 94
[pid 23527] open("/proc/loadavg", O_RDONLY) = 95
[pid 23483] open("/proc/loadavg", O_RDONLY) = 95
[pid 23527] open("/proc/loadavg", O_RDONLY) = 95
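
To cut out the /proc/loadavg noise, the strace output can be narrowed down to opens under the PG directory, e.g.:

# strace -f -e trace=open -p 23471 2>&1 | grep 0.1e_head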


So why did the OSD do nothing with PG 0.1e during the first deep-scrub?
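
Wild guess on my side: maybe the OSD did not need to open() the object files again during the first scrub (FileStore keeps an fd cache), so tracing only open/access would show nothing even if the data was read. The cache size could at least be checked over the admin socket, e.g.:

# ceph daemon osd.0 config get filestore_fd_cache_size

(osd id guessed from the nmz-0 directory; with a non-default cluster name the admin socket path may have to be given explicitly.)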

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


