Re: default min_size for erasure pools

On Wed, Mar 9, 2016 at 6:25 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> Hi,
>
> For replicated pools we default to min_size=2 when size=3
> (i.e., size - size/2) in order to avoid the split-brain scenario, for
> example as described here:
> http://www.spinics.net/lists/ceph-devel/msg27008.html
>
> But for erasure pools we default to min_size=k, which I think is a
> recipe for similar problems.
>
> Shouldn't we default to at least min_size=k+1??
>
> diff --git a/src/mon/OSDMonitor.cc b/src/mon/OSDMonitor.cc
> index 77e26de..5d51686 100644
> --- a/src/mon/OSDMonitor.cc
> +++ b/src/mon/OSDMonitor.cc
> @@ -4427,7 +4427,7 @@ int OSDMonitor::prepare_pool_size(const unsigned pool_type,
>        err = get_erasure_code(erasure_code_profile, &erasure_code, ss);
>        if (err == 0) {
>         *size = erasure_code->get_chunk_count();
> -       *min_size = erasure_code->get_data_chunk_count();
> +       *min_size = erasure_code->get_data_chunk_count() + 1;
>        }
>      }
>      break;
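
For concreteness, here is a minimal sketch of the defaults in question
(the k=4/m=2 profile is an assumed example, not from the patch; the real
code reads these values from the erasure code plugin via
get_chunk_count() and get_data_chunk_count()):

// Illustrative sketch of the min_size defaults being discussed.
#include <iostream>

int main() {
  unsigned rep_size = 3;
  unsigned rep_min = rep_size - rep_size / 2;  // replicated: 3 -> 2

  unsigned k = 4, m = 2;       // assumed example: 4 data, 2 coding chunks
  unsigned ec_size = k + m;    // get_chunk_count()
  unsigned ec_min_old = k;     // current default
  unsigned ec_min_new = k + 1; // proposed default

  std::cout << "replicated: size=" << rep_size
            << " min_size=" << rep_min << "\n";
  std::cout << "erasure:    size=" << ec_size
            << " min_size=" << ec_min_old
            << " -> " << ec_min_new << "\n";
  return 0;
}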

Well, losing any OSDs at that point would be bad, since the PG would
become inaccessible until you get that whole set back, but there's not
really any chance of serving up bad reads like Sam is worried about in
the ReplicatedPG case. (...at least, assuming you have more data chunks
than parity chunks.) Send in a PR on github?
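
Until a change like this lands, existing erasure pools can be bumped by
hand with "ceph osd pool set <pool> min_size <k+1>" (so min_size 5 for
the illustrative k=4/m=2 profile above), and the current value checked
with "ceph osd pool get <pool> min_size".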
-Greg