Erasure code failure


 



Imagine we have a cluster with 3 OSDs and I create an erasure-coded pool with k=2, m=1.
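(For reference, something along these lines should create that setup on a recent Ceph release; the profile and pool names are just placeholders and the PG counts would need tuning:

    ceph osd erasure-code-profile set ec21 k=2 m=1 crush-failure-domain=osd
    ceph osd pool create ecpool 64 64 erasure ec21
    ceph osd pool get ecpool min_size
)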

If one OSD fails, we can rebuild the data, but (I think) the whole cluster won't be able to perform I/O.

Wouldn't it be possible to make the cluster work in a degraded mode?
I think it would be a good idea to let the cluster keep working in degraded mode and promise to rebalance/rebuild whenever a third OSD comes back alive.
On reads, it could serve the data from the live chunks, rebuilding the missing ones on the fly where necessary (spending CPU to recalculate the data before serving it, with 0 RTA); or it could rebuild the missing parts up front so that both data chunks actually live on the 2 surviving OSDs (with some RTA and extra space usage); or it could even do both at the same time (at a high network, CPU and storage cost).
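To make the read idea concrete, here is a toy sketch in plain Python (not Ceph's real jerasure code): with k=2 and a single parity chunk, the parity is effectively the XOR of the two data chunks, so any one lost chunk can be recomputed from the two survivors before serving the read.

# Toy sketch of the read path for k=2 m=1 (not Ceph's actual implementation).
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def serve_read(chunks):
    """chunks maps 'd0', 'd1', 'p' to bytes, or None if that OSD is down."""
    missing = [name for name, data in chunks.items() if data is None]
    if len(missing) > 1:
        raise RuntimeError("k=2 m=1 only survives one lost chunk")
    if missing:
        survivors = [data for data in chunks.values() if data is not None]
        chunks[missing[0]] = xor_bytes(*survivors)  # rebuild in memory before serving
    return chunks["d0"] + chunks["d1"]              # the object's original data

# Example: the OSD holding d1 is down; the read is still served.
parity = xor_bytes(b"AAAA", b"BBBB")
print(serve_read({"d0": b"AAAA", "d1": None, "p": parity}))  # b'AAAABBBB'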
On writes, it could write the 2 data chunks to the live OSDs, and whenever the third OSD comes back up, the cluster could rebalance by rebuilding the parity chunk and repositioning the chunks so that all OSDs end up with the same amount of data/work.
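And the write idea in the same toy model (only the parity-rebuild part, not the repositioning; names like degraded_stripes are made up for the sketch): while the third OSD is down, only the two data chunks are written to the live OSDs and the stripe is remembered as degraded; when the OSD returns, the parity chunks it is owed are computed and placed on it.

# Toy sketch of the degraded write path for k=2 m=1.
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

osd0, osd1, osd2 = {}, {}, {}      # osd2 is currently down
degraded_stripes = []              # stripes still owing their parity chunk

def degraded_write(obj, d0, d1):
    osd0[obj] = d0
    osd1[obj] = d1
    degraded_stripes.append(obj)   # parity will be rebuilt later

def on_osd2_recovered():
    for obj in degraded_stripes:
        osd2[obj] = xor_bytes(osd0[obj], osd1[obj])  # rebuild the missing parity chunk
    degraded_stripes.clear()

degraded_write("obj1", b"AAAA", b"BBBB")
on_osd2_recovered()
print(osd2["obj1"] == xor_bytes(b"AAAA", b"BBBB"))   # True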

Would this be possible?


Jorge Pinilla López
jorpilo@xxxxxxxxx
Computer engineering student
Systems area intern (SICUZ)
Universidad de Zaragoza
PGP-KeyID: A34331932EBC715A

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
