RE: scrub test assert

Greg,
I think one point is missing from the description.

After deleting pg entries, 1 PG becomes inconsistent (as expected).

We ran pg repair and the cluster became consistent again; all PGs are active+clean.

But, after restart we got the assert.

So, yes, it seems there is some cache effect and not everything was persisted during pg repair (?).

I guess the question is: if PG repair succeeded, why does the OSD hit the assert during restart? There shouldn't be any missing files. Am I missing anything?

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Gregory Farnum
Sent: Thursday, January 07, 2016 12:08 PM
To: Evgeniy Firsov
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: scrub test assert

On Thu, Jan 7, 2016 at 1:17 AM, Evgeniy Firsov <Evgeniy.Firsov@xxxxxxxxxxx> wrote:
> Hi, Devs,
>
> We hit an assert while doing scrub test.
> Can you, please, verify if the test case is valid?
>
> The test:
> 1. Start Jewel cluster. 2 nodes, 8 osds each.
> 2. Start fio, pure write workload.
> 3. Delete data of random pg: rm -rf /var/lib/ceph/osd/ceph-7/2.185_head/*
> 4. Stop workload
> 5. Do a scrub on that pg. Cluster active and clean after that.
> 6. Restart the node with that pg.
> 7. The OSD with that pg fails to recover and asserts with:
>      osd/PGLog.cc: 382: FAILED assert(objiter->second->version > last_divergent_update)

The OSD is passing scrub tests because of the file descriptor cache (so it still has its own fds open for the deleted files that it references), but then on restart everything gets wiped away and it discovers its data store is inconsistent. I believe your team has run into this before when designing data loss tests. :)
-Greg
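
The fd cache explanation above rests on ordinary POSIX behavior: a file removed with rm stays readable through any descriptor that was already open, and its data only goes away when the last descriptor is closed. The following is a minimal sketch of that behavior in Python, not Ceph code. The function name read_after_unlink is hypothetical, and a POSIX filesystem is assumed (on Windows, unlinking an open file typically fails).

```python
import os
import tempfile


def read_after_unlink(payload):
    """Write payload to a temp file, unlink it, then read it back via the open fd.

    Returns (bytes read through the fd, whether the path still exists).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                # path is gone, like `rm` under a running process
        os.lseek(fd, 0, os.SEEK_SET)   # rewind the still-open descriptor
        data = os.read(fd, len(payload))
        return data, os.path.exists(path)
    finally:
        os.close(fd)                   # last reference dropped; data is now unreachable


if __name__ == "__main__":
    data, path_exists = read_after_unlink(b"object data")
    print(data, path_exists)
```

In this sketch the open descriptor plays the role of an entry in the OSD's fd cache, and closing it stands in for the restart: while it is open, reads succeed even though a directory listing would show nothing.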