Re: How to recover from active+clean+inconsistent+failed_repair?

Hi Frank,

> I'm not sure if my hypothesis can be correct. Ceph acknowledges a write only after all copies are on disk. In other words, if PGs end up on different versions after a power outage, one always needs to roll back. Since you have two healthy OSDs in the PG and the PG is active (successfully peered), it might just be a broken disk causing read/write errors. I would focus on that.

I tried to revert the PG as follows:
# ceph pg 3.b query | grep version
        "last_user_version": 2263481,
        "version": "4825'2264303",
        "last_user_version": 2263481,
        "version": "4825'2264301",
        "last_user_version": 2263481,
        "version": "4825'2264301",
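
To make the mismatch explicit, here is a minimal sketch that parses the grep output above and compares the three copies. It assumes each "version" value has the form epoch'version (my understanding of the format, not something confirmed in this thread). The names QUERY_OUTPUT and parse_versions are only illustrative.

```python
import re

# Output of `ceph pg 3.b query | grep version`, copied from this message.
QUERY_OUTPUT = '''
        "last_user_version": 2263481,
        "version": "4825'2264303",
        "last_user_version": 2263481,
        "version": "4825'2264301",
        "last_user_version": 2263481,
        "version": "4825'2264301",
'''


def parse_versions(text):
    """Return an (epoch, version) tuple for every "version" line.

    Assumption: the value is formatted as epoch'version. Lines such as
    "last_user_version" are skipped because the pattern requires a quote
    directly before the word "version".
    """
    pattern = re.compile(r'"version":\s*"(\d+)\'(\d+)"')
    return [(int(epoch), int(ver)) for epoch, ver in pattern.findall(text)]


versions = parse_versions(QUERY_OUTPUT)
newest = max(versions)  # tuples compare by epoch first, then version

# How far each copy is behind the newest one (all share epoch 4825 here,
# so subtracting the version counters is meaningful).
lag = [newest[1] - ver for _, ver in versions]

print(versions)  # [(4825, 2264303), (4825, 2264301), (4825, 2264301)]
print(lag)       # [0, 2, 2]
```

So the first entry is two updates ahead of the other two, while last_user_version is identical on all three.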

# ceph pg 3.b list_unfound
{
    "num_missing": 0,
    "num_unfound": 0,
    "objects": [],
    "more": false
}

# ceph pg 3.b mark_unfound_lost revert
pg has no unfound objects

# ceph pg 3.b revert
Invalid command: revert not in query
pg <pgid> query :  show details of a specific pg
Error EINVAL: invalid command

How do I revert or roll back a PG?

> Another question: do you have write caches enabled (disk cache and controller cache)? This is known to cause problems on power outages and also degraded performance with Ceph. You should check and disable any caches if necessary.

No. The HDD is connected directly to the motherboard.

Thank you,
Sagara

  
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



