Re: Understanding/correcting sudden onslaught of unfound objects

On Thu, Feb 15, 2018 at 9:41 AM Graham Allan <gta@xxxxxxx> wrote:
Hi Greg,

On 02/14/2018 11:49 AM, Gregory Farnum wrote:
>
> On Tue, Feb 13, 2018 at 8:41 AM Graham Allan <gta@xxxxxxx
> <mailto:gta@xxxxxxx>> wrote:
>
>     I'm replying to myself here, but it's probably worth mentioning that
>     after this started, I did bring back the failed host, though with "ceph
>     osd weight 0" to avoid more data movement.
>
>     For inconsistent pgs containing unfound objects, the output of "ceph pg
>     <n> query" does then show the original osd being queried for objects,
>     and indeed if I dig through the filesystem I find the same 0-byte files
>     dated from 2015-2016.
>
>     This strongly implies to me that data loss occurred a long time in the
>     past and is not related to the osd host going down - this only triggered
>     the problem being found.
>
>
> I would assume that too, but unless you had scrubbing disabled then it
> should have been discovered long ago; I don’t understand how it could
> have stayed hidden. Did you change any other settings recently?
>
> Or, what is this EC pool being used for, and what are the EC settings?
> Having a bunch of empty files is not surprising if the objects are
> smaller than the chunk/stripe size — then just the primary and the
> parity locations would actually have data for them.

We didn't have scrubbing disabled. It /was/ dialed back somewhat using
the "osd scrub sleep|priority|load_threshold" parameters, but I would see
it running, and the last scrub times for pgs looked ok whenever I checked
them. (I guess it is appropriate to revisit and maybe remove "tuning
tweaks" such as the above, as ceph defaults change/improve over time.)

The (4+2) EC pool is used as backing for radosgw - the pool itself
contains ~400T data in ~300M objects.
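
As a rough sanity check on those numbers, here is a back-of-the-envelope
sketch in Python. It assumes "T" means decimal terabytes and that data is
spread evenly across objects, neither of which is stated above; the
variable names are only for illustration.

```python
# Back-of-the-envelope figures for a k=4, m=2 erasure-coded pool.
# Assumptions (not from the thread): "T" is 10**12 bytes, and objects
# are treated as if they were all the average size.
K, M = 4, 2                       # data chunks, coding (parity) chunks
logical_bytes = 400 * 10**12      # ~400T of stored data
object_count = 300 * 10**6        # ~300M objects

avg_object_bytes = logical_bytes / object_count
avg_data_shard_bytes = avg_object_bytes / K
raw_bytes = logical_bytes * (K + M) / K   # space used including parity

print(f"average object size: {avg_object_bytes / 1e6:.2f} MB")
print(f"average per data shard: {avg_data_shard_bytes / 1e3:.0f} KB")
print(f"raw usage: {raw_bytes / 1e12:.0f} T")
```

That works out to an average of roughly 1.3 MB per object, about 333 KB
per data shard, and about 600T of raw space for the pool.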

Having examined some specific "unfound" objects in some detail, I am
less pessimistic but even more confused.

For a huge number of the objects I find they are also listed as 0 bytes
in size when listed via s3 - and these are grouped by "directory" (path),
so it really makes me question whether these were ever successful
uploads. It would explain the 0-byte shard files I see in filestore.

However, some objects definitely do contain data. I have some identified
from last week as unfound, which I traced to 6x 0-byte filestore files,
and attempting to download them from s3 would simply stall (I do need to
follow up on radosgw logs for this case).

However, these same test objects now *do* download successfully. The pg
itself has reverted to active+clean+inconsistent state. The files
contain image data so I can load them and see they are as expected. If I
again trawl through all the pg filestore locations, I can still only
find the six 0-byte files - where is the data coming from!?

Well, if the objects were uploaded using multi-part upload I believe the
objects you’re looking at here will only contain omap (or xattr?) entries
pointing to the part files, so the empty file data is to be expected.
This might also make slightly more sense in terms of the scrub
inconsistencies popping up, although I didn’t think any omap issues I
remember should have impacted rgw.

Other than that, I’m not sure how it would be turning 0 bytes of data
into the correct results.




Here's an example (first 3 osds for the pg, the other 3 are the same):
> root@cephmon1:~# ssh ceph03 find /var/lib/ceph/osd/ceph-295/current/70.3d6s0_head -name '*1089213*' -exec ls -ltr {} +
> -rw-r--r-- 1 root root 0 Jan 30 13:50 /var/lib/ceph/osd/ceph-295/current/70.3d6s0_head/DIR_6/DIR_D/DIR_3/DIR_0/DIR_0/DIR_4/default.325674.85\ubellplants\uimages\s1089213.jpg__head_794003D6__46_ffffffffffffffff_0
> root@cephmon1:~# ssh ceph01 find /var/lib/ceph/osd/ceph-221/current/70.3d6s1_head -name '*1089213*' -exec ls -ltr {} +
> -rw-r--r-- 1 root root 0 Aug 22  2015 /var/lib/ceph/osd/ceph-221/current/70.3d6s1_head/DIR_6/DIR_D/DIR_3/DIR_0/DIR_0/DIR_4/default.325674.85\ubellplants\uimages\s1089213.jpg__head_794003D6__46_ffffffffffffffff_1
> root@cephmon1:~# ssh ceph08 find /var/lib/ceph/osd/ceph-357/current/70.3d6s2_head -name '*1089213*' -exec ls -ltr {} +
> -rw-r--r-- 1 root root 0 Feb 17  2016 /var/lib/ceph/osd/ceph-357/current/70.3d6s2_head/DIR_6/DIR_D/DIR_3/DIR_0/DIR_0/DIR_4/default.325674.85\ubellplants\uimages\s1089213.jpg__head_794003D6__46_ffffffffffffffff_2
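
For anyone cross-checking listings like these, below is a minimal Python
sketch that splits one of the shard filenames into its apparent fields.
The field layout (name, key, snap, hash, namespace, pool in hex,
generation, shard) and the escapes (\u for "_", \s for "/") are my
reading of the names above, not something confirmed in this thread, and
the function names are hypothetical.

```python
import re

# Assumed filestore escapes: "\u" -> "_", "\s" -> "/", "\\" -> "\".
_ESCAPES = {"u": "_", "s": "/", "\\": "\\"}

def unescape_name(escaped):
    """Undo the backslash escapes seen in the on-disk object name."""
    return re.sub(r"\\(.)",
                  lambda m: _ESCAPES.get(m.group(1), m.group(0)),
                  escaped)

def parse_shard_filename(filename):
    """Split an EC shard filename into fields, under the assumed layout
    <name>_<key>_<snap>_<hash>_<namespace>_<pool>_<generation>_<shard>."""
    (name, key, snap, hash_hex,
     namespace, pool_hex, gen_hex, shard) = filename.rsplit("_", 7)
    return {
        "object": unescape_name(name),
        "snap": snap,
        "hash": hash_hex,
        "pool": int(pool_hex, 16),          # pool id is written in hex
        "shard": int(shard),
        # The DIR_x path components appear to be the hash nibbles reversed.
        "dir_nibbles": list(hash_hex[::-1]),
    }

example = (r"default.325674.85\ubellplants\uimages\s1089213.jpg"
           r"__head_794003D6__46_ffffffffffffffff_0")
info = parse_shard_filename(example)
print(info["object"], info["pool"], info["shard"], info["dir_nibbles"][:6])
```

For the file on ceph-295 this gives pool 70, shard 0 and leading
directory nibbles 6/D/3/0/0/4, which agrees with the 70.3d6s0_head
directory and the DIR_6/DIR_D/DIR_3/DIR_0/DIR_0/DIR_4 path in the
listing.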

Thanks for any insights!

Graham
--
Graham Allan
Minnesota Supercomputing Institute - gta@xxxxxxx
