Re: hammer - lost object after just one OSD failure?

On Wed, May 4, 2016 at 12:00 AM, Nikola Ciprich
<nikola.ciprich@xxxxxxxxxxx> wrote:
> Hi,
>
> I was doing some performance tuning on a test cluster of just 2
> nodes (each with 10 OSDs). I have a test pool with 2 replicas (size=2, min_size=2).
>
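
For reference, the pool's replication settings can be double-checked with
the commands below; "testpool" here is a stand-in for the actual pool name:

    # confirm the replication settings of the pool
    ceph osd pool get testpool size
    ceph osd pool get testpool min_size
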
> Then one of the OSDs crashed due to a failing hard drive. All remaining OSDs were
> fine, but the health status reported one lost object.
>
> Here's the detail:
>
>     "recovery_state": [
>         {
>             "name": "Started\/Primary\/Active",
>             "enter_time": "2016-05-04 07:59:10.706866",
>             "might_have_unfound": [
>                 {
>                     "osd": "0",
>                     "status": "osd is down"
>                 },
>                 {
>                     "osd": "10",
>                     "status": "already probed"
>                 }
>             ],
>
>
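Output like the recovery_state excerpt above generally comes from querying
the placement group directly; a sketch, using "2.37" as a placeholder pg id:

    # dump the full PG state, including recovery_state
    ceph pg 2.37 query
    # list the objects this PG considers missing/unfound
    ceph pg 2.37 list_missing
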
> It was not important data, so I just discarded it, as I don't need
> to recover it, but now I'm wondering what the cause of all this is.
>
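Discarding an unfound object usually means marking it lost; a sketch, again
with a placeholder pg id ("revert" rolls back to a prior version of the
object if one exists, "delete" forgets it entirely):

    # give up on the unfound objects in this PG
    ceph pg 2.37 mark_unfound_lost revert
    # ... or drop them outright
    ceph pg 2.37 mark_unfound_lost delete
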
> I have min_size set to 2, and I thought that writes are confirmed only after
> they reach the journals of all target OSDs, no? Is there something specific I should
> check? Maybe I have a bug in my configuration? Or how else could this object
> be lost?

Is OSD 0 the one which had a failing hard drive? And OSD 10 is
supposed to be fine?

In general, what you're describing does make it sound like something
underneath the Ceph code lost objects, but if one of those OSDs has never
had a problem, I'm not sure what it could be.

(The most common failure mode is power loss while the user has write
barriers turned off, a misconfigured RAID card, or similar.)
-Greg
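
To help rule out the failure modes above, it may be worth checking the
drives' volatile write caches and the filesystem barrier settings on the
OSD nodes; a sketch, assuming an OSD data disk at /dev/sdb formatted with
XFS (adjust the device to match your setup):

    # is the drive's volatile write cache enabled?
    hdparm -W /dev/sdb
    # was the OSD filesystem mounted with nobarrier?
    mount | grep sdb
    # quick SMART health verdict on the (suspect) drive
    smartctl -H /dev/sdb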

>
> I'd be grateful for any info
>
> br
>
> nik
>
> --
> -------------------------------------
> Ing. Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28.rijna 168, 709 00 Ostrava
>
> tel.:   +420 591 166 214
> fax:    +420 596 621 273
> mobil:  +420 777 093 799
> www.linuxbox.cz
>
> mobil servis: +420 737 238 656
> email servis: servis@xxxxxxxxxxx
> -------------------------------------
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


