RE: Upgrade/rollback

----------------------------------------
> Date: Thu, 12 Feb 2015 06:57:19 -0800
> Subject: Re: Upgrade/rollback
> From: greg@xxxxxxxxxxx
> To: yguang11@xxxxxxxxxxx
> CC: ceph-devel@xxxxxxxxxxxxxxx
>
> On Thu, Feb 12, 2015 at 12:48 AM, GuangYang <yguang11@xxxxxxxxxxx> wrote:
>> Thanks Sage and Greg for the response.
>>
>>> 2) having a separate switchover point (besides the code upgrade) which
>>> enables all the disk change bits and which doesn't allow you to roll
>>> back.
>> Let me give two examples that prevent us from rolling back from Giant to Firefly.
>>
>> Example #1:
>> In Giant, a new feature flag 'CEPH_FEATURE_ERASURE_CODE_PLUGINS_V2' was added and persisted. On startup, the monitor checks the persisted list against the list released along with the software version, and it refuses to start if the lists do not match. However, although the feature was added in Giant, it is not used until we create a new pool with the profile, which is very unlikely to happen.
>> 1) Is it possible to persist the new feature bit only when the feature is actually being used (this looks complicated to implement)? 2) When loading the persisted bit, is it possible to check whether it is actually used by anyone?
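To make option 2) concrete, here is a minimal sketch of the two startup checks. Everything in it is an assumption for illustration: the function names, the string feature names, and the idea that an "in use" set is available at startup are hypothetical and are not the monitor's actual code or data.

```python
# Hypothetical sketch; none of these names are real Ceph APIs.
SUPPORTED_BY_OLD_BINARY = {"base", "erasure_code_plugins_v1"}


def can_start_strict(persisted, supported):
    """Behavior as described above: any unknown persisted bit blocks startup."""
    return persisted <= supported


def can_start_if_unused(persisted, supported, in_use):
    """Proposed variant: only bits that something depends on must be supported."""
    return (persisted & in_use) <= supported


persisted = {"base", "erasure_code_plugins_v1", "erasure_code_plugins_v2"}
in_use = {"base", "erasure_code_plugins_v1"}  # assumed: no pool uses a v2 profile

print(can_start_strict(persisted, SUPPORTED_BY_OLD_BINARY))             # False
print(can_start_if_unused(persisted, SUPPORTED_BY_OLD_BINARY, in_use))  # True
```

The sketch assumes the "in use" set can be computed reliably, which is the hard part of the proposal.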
>>
>> Example #2:
>> Patch [1] added a new k/v to the PG log which cannot be recognized by the old version of the binary (PGLog::read_log); as a result, the old binary treats the newly added entry as a pg_log_entry.
>> Is it possible to recognize a pg_log_entry by a concrete pattern and simply ignore the entries that the binary cannot recognize?
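A rough sketch of what "recognize by a concrete pattern and ignore the rest" could look like. The key prefix, the dictionary store, and the read_log function below are hypothetical stand-ins and do not reflect the real on-disk format or the real PGLog::read_log.

```python
# Hypothetical key layout; not the real format read by PGLog::read_log.
LOG_ENTRY_PREFIX = "entry_"


def read_log(kv_pairs):
    """Return (entries, skipped_keys), keeping only keys that match the known pattern."""
    entries, skipped = [], []
    for key, value in sorted(kv_pairs.items()):
        if key.startswith(LOG_ENTRY_PREFIX):
            entries.append((key, value))
        else:
            # Unknown key: record it instead of decoding it as a log entry.
            skipped.append(key)
    return entries, skipped


store = {
    "entry_0000000001": "write obj_a",
    "entry_0000000002": "delete obj_b",
    "new_metadata_key": "written by the newer binary",
}
entries, skipped = read_log(store)
print(len(entries), skipped)  # 2 ['new_metadata_key']
```

This only avoids the misparse; whether skipping a key is safe depends on what that key means.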
>>
>>
>> For these two cases, we may be able to erase the newly added entries and then roll back (correct me if I am wrong here), but I think there might be more complicated cases which make the rollback impossible, and we would have to accept that risk when upgrading.
>
> For these two specific cases, maybe. But you're missing more
> fundamental things: often changes to data structures are about
> behavior changes that the daemon needs to understand in order to make
> any sense of the data. For instance, any upgrades to CRUSH need to be
> understood by everybody participating in the cluster. We could
> narrowly have the parsing code ignore anything it doesn't understand,
> but then when it does calculations about past_intervals or current
> mappings it would be wrong!
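The point about ignored fields producing wrong mappings can be illustrated with a toy example. The placement function below is deliberately trivial and is not CRUSH; the "offset" field and both function names are hypothetical.

```python
# Toy placement function for illustration only; this is not CRUSH.
def map_pg(pg_id, num_osds, rules):
    """Pick an OSD for a PG; newer code also applies the 'offset' rule field."""
    return (pg_id + rules.get("offset", 0)) % num_osds


def map_pg_old(pg_id, num_osds, rules):
    """Older code: parses the same rules but silently ignores 'offset'."""
    known = {k: v for k, v in rules.items() if k != "offset"}
    return map_pg(pg_id, num_osds, known)


rules = {"offset": 1}  # hypothetical field introduced by the newer release
new_view = [map_pg(pg, 4, rules) for pg in range(4)]
old_view = [map_pg_old(pg, 4, rules) for pg in range(4)]
print(new_view)  # [1, 2, 3, 0]
print(old_view)  # [0, 1, 2, 3]
```

Both versions parse the rules without error, yet they disagree on where every PG lives.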
>
> Or in the example #2 you have, the extra data is a bug fix that
> prevents the OSD doing extra work. But what if it was actually about
> changing the shared PG state? In that case you might have OSDs with
> their PG in different states depending on how far they'd gotten when
> rolled back to the old code.
>
> It's just not a feasible problem, because while there are plenty of
> things that we could route around, we still face the inevitable
> collision point I described when we do add data of shared import. :(
That makes total sense. Thanks Greg!

> -Greg




