Re: [PATCH] reinstate ceph cluster_snap support

On Dec 17, 2013, Alexandre Oliva <oliva@xxxxxxx> wrote:

>> Finally, eventually we should make this do a checkpoint on the mons too.  
>> We can add the osd snapping back in first, but before this can/should 
>> really be used the mons need to be snapshotted as well.  Probably that's 
>> just adding in a snapshot() method to MonitorStore.h and doing either a 
>> leveldb snap or making a full copy of store.db... I forget what leveldb is 
>> capable of here.

> I haven't looked into this yet.

I looked a bit at the leveldb interface.  It offers a facility to create
Snapshots, but they only last for the duration of one session of the
database.  It can be used to create multiple iterators over one state of
the db, or to read multiple values from the same state of the db, but
not to roll back to a state you had at an earlier session, e.g., after a
monitor restart.  So they won't help us.

I thus see a few possibilities (all of them to be done between taking
note of the request for the new snapshot and returning a response to the
requestor that the request was satisfied):

1. take a snapshot, create an iterator out of the snapshot, create a new
database named after the cluster_snap key, and go over all key/value
pairs that the iterator can see, adding each one to this new database.

2. close the database, create a dir named after the cluster_snap key,
create hardlinks to all files in the database tree in the cluster_snap
dir, and then reopen the database.

3. flush the leveldb (how?  will a write with sync=true do?  must we
close it?) and take a btrfs snapshot of the store.db tree, named after
the cluster_snap key, and then reopen the database
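
To make option 2 concrete, here is a minimal sketch in Python of the hard-link step alone.  The function name hardlink_snapshot and the directory arguments are hypothetical, not Ceph or leveldb APIs.  It assumes the database has already been closed, that the snapshot dir lives on the same filesystem as store.db and outside its tree, and that leveldb never modifies a file in place once it has been closed, which is exactly the implementation detail this option depends on.

```python
import os


def hardlink_snapshot(store_dir, snap_dir):
    """Mirror store_dir into snap_dir using hard links instead of copies.

    Assumes the database is closed, snap_dir is on the same filesystem as
    store_dir, and snap_dir is not located inside store_dir.
    Returns the number of files linked.
    """
    os.makedirs(snap_dir)  # raises if a snapshot with this name already exists
    linked = 0
    for root, dirs, files in os.walk(store_dir):
        rel = os.path.relpath(root, store_dir)
        target_root = snap_dir if rel == "." else os.path.join(snap_dir, rel)
        for name in dirs:
            # Recreate the directory layout; directories cannot be hard-linked.
            os.mkdir(os.path.join(target_root, name))
        for name in files:
            # Each link shares the inode with the original, so no data is copied.
            os.link(os.path.join(root, name), os.path.join(target_root, name))
            linked += 1
    return linked
```

A caller would close the database, call hardlink_snapshot(store_dir, snap_dir) with a snap_dir named after the cluster_snap key, and then reopen the database.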

None of these is particularly appealing: (1) wastes disk space and CPU
cycles; (2) relies on leveldb internal implementation details, such as
the fact that files are never modified after they're first closed; and
(3) requires a btrfs subvol for the store.db.  My favorite choice would
be 3, but can we just fail mon snaps when this requirement is not met?
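
As a rough illustration of "fail when the requirement is not met" for option 3, the Python sketch below refuses to proceed unless store.db looks like a btrfs subvolume root, and otherwise invokes the btrfs command-line tool.  All function names are hypothetical.  The inode check relies on btrfs giving every subvolume root inode number 256; it is only a heuristic, since it does not verify the filesystem type.  The sketch also leaves the flushing question open and simply assumes the database has been closed or otherwise made consistent before the call.

```python
import os
import subprocess

# btrfs assigns this inode number to the root directory of every subvolume.
BTRFS_SUBVOL_ROOT_INO = 256


def looks_like_subvolume(path, stat=os.stat):
    """Heuristic only: true if path has the inode number of a subvolume root."""
    return stat(path).st_ino == BTRFS_SUBVOL_ROOT_INO


def snapshot_cmd(store_dir, snap_name):
    """Build the argv for a read-only snapshot placed next to store_dir."""
    parent = os.path.dirname(os.path.abspath(store_dir))
    return ["btrfs", "subvolume", "snapshot", "-r",
            store_dir, os.path.join(parent, snap_name)]


def snapshot_store(store_dir, snap_name, run=subprocess.run):
    """Snapshot store_dir, or fail the request if it is not a subvolume."""
    if not looks_like_subvolume(store_dir):
        raise RuntimeError("%s is not a btrfs subvolume; refusing mon snap"
                           % store_dir)
    # check=True turns a non-zero exit status from btrfs into an exception.
    run(snapshot_cmd(store_dir, snap_name), check=True)
```

The snapshot is placed beside store.db rather than inside it, so that it does not become part of the tree being snapshotted.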

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer