Brian,
Never mind...looking back through some older emails, I do see an
indication of a problem with that drive.
I will fail out the osd and replace the drive.
Thanks again for the help,
Shain
On 03/17/2017 03:08 PM, Shain Miley wrote:
Brian,
Thank you for the detailed information. I was able to
compare the 3 hexdump files and it looks like the primary pg
is the odd man out.
I stopped the OSD and then I attempted to move the object:
root@hqosd3:/var/lib/ceph/osd/ceph-32/current/3.2b8_head/DIR_8/DIR_B/DIR_2/DIR_A/DIR_0# mv rb.0.fe307e.238e1f29.00000076024c__head_4650A2B8__3 /root
mv: error reading ‘rb.0.fe307e.238e1f29.00000076024c__head_4650A2B8__3’: Input/output error
mv: failed to extend ‘/root/rb.0.fe307e.238e1f29.00000076024c__head_4650A2B8__3’: Input/output error
However, as shown above, the move failed with an Input/output error.
I assume that is not what normally happens.
Any ideas on how I should proceed at this point? Should I
fail out this OSD and replace the drive (I have had no
indication other than the I/O error that there is an issue with
this disk), or is there something I can try first?
Thanks again,
Shain
On 03/17/2017 11:38 AM, Brian Andrus wrote:
We went through a period of time where we were
experiencing these daily...
cd to the PG directory on each OSD and do a find for "238e1f29.00000076024c" (mentioned
in your error message). This will likely return a file
that has a backslash in the name, something like rbd\udata.238e1f29.00000076024c_head_blah_1f...
Run hexdump -C on the object
(tab completing the name helps) and redirect the output to a
different location. Once you have the hexdumps from each OSD, run a
diff or cmp against them and find which one is not like
the others.
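As a rough sketch of that comparison step, a short Python script can hash each copy and report which one disagrees with the majority. It assumes the replicas of the object have already been copied off each OSD to local paths. The function names, labels, and paths here are hypothetical and not part of any Ceph tooling. A copy that cannot be read at all (for example, because of an I/O error) is reported as an outlier too.

```python
import hashlib
from collections import Counter


def digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, or None if it cannot be read."""
    h = hashlib.sha256()
    try:
        with open(path, "rb") as f:
            # Read in chunks so large objects do not need to fit in memory.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
    except OSError:
        # An unreadable copy (e.g. Input/output error) has no digest.
        return None
    return h.hexdigest()


def find_outliers(copies):
    """Return labels of copies that differ from the majority content.

    `copies` maps a label of your choosing (e.g. "osd.32") to the path
    of that OSD's copy of the object.
    """
    digests = {label: digest(path) for label, path in copies.items()}
    readable = [d for d in digests.values() if d is not None]
    if not readable:
        # Nothing could be read, so every copy is suspect.
        return sorted(digests)
    majority, count = Counter(readable).most_common(1)[0]
    if count <= len(digests) / 2:
        raise ValueError("no majority among the copies; inspect them by hand")
    return sorted(label for label, d in digests.items() if d != majority)
```

Calling find_outliers({"osd.32": "/tmp/osd32.obj", ...}) with one entry per replica returns the labels of the odd copies. If the primary's label is in that list, follow the stop/move/start steps before issuing the repair.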
If the primary is not
the outlier, perform the PG repair without worry. If the
primary is the outlier, you will need to stop the OSD,
move the object out of place, start it back up and then
it will be okay to issue a PG repair.
Other, less common causes of
inconsistent PGs that we see are differing object sizes (easy
to detect by listing the file sizes) and differing
attributes ("attr -l", but the error logs are usually
precise in identifying the problematic PG copy).
--
NPR | Shain Miley | Manager of Infrastructure, Digital Media | smiley@xxxxxxx | 202.513.3649
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com