ceph osd replacement with shared journal device

Hi Owen,

> On 29 Sep 2014, at 10:33, Owen Synge <osynge at suse.com> wrote:
> 
> Hi Dan,
> 
> Looking at upstream, at least, getting journals and partitions to work
> persistently requires GPT partitions, and being able to add a GPT
> partition UUID, if it is to work perfectly with minimal modification.
> 
> I am not sure of the status of this on RHEL6. The latest Fedora and
> openSUSE support this, and SLE12 (to be released) and, I think, RHEL7
> support it too.
> 
> I'm sure you can bypass this, as every data partition contains a symlink
> to the journal partition, but persistent naming may be more work if you
> don't use GPT partitions.

The persistent names and udev triggers all work when I first set up the drives with ceph-disk. The partition tables are indeed GPT, and the journal links point to the persistent by-partuuid links. My setup is like this, and it works perfectly:

ceph-disk prepare /dev/sde /dev/sda
ceph-disk prepare /dev/sdf /dev/sda
ceph-disk prepare /dev/sdg /dev/sda
ceph-disk prepare /dev/sdh /dev/sda
ceph-disk prepare /dev/sdi /dev/sda

(Each time, ceph-disk creates the next partition on sda and creates the correct persistent links. The udev trigger calls ceph-disk activate, and the OSD is eventually started.)

My only question is about the replacement procedure (e.g. for sde). The options I've seen are:
  - ceph-disk prepare /dev/sde /dev/sda: this will create a 6th partition on sda.
  - ceph-disk prepare /dev/sde /dev/sda1: in this case the journal link is to sda1 instead of the persistent link.
  - parted /dev/sda rm 1; ceph-disk prepare /dev/sde /dev/sda: I thought this was working, but in fact the partition table looks like this afterwards (part #1 is at the end of the disk):

Number  Start   End     Size    File system  Name          Flags
 2      21.5GB  43.0GB  21.5GB               ceph journal
 3      43.0GB  64.4GB  21.5GB               ceph journal
 4      64.4GB  85.9GB  21.5GB               ceph journal
 5      85.9GB  107GB   21.5GB               ceph journal
 1      107GB   129GB   21.5GB               ceph journal
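
A misordering like this is easy to miss by eye, so it can be checked mechanically. The following is only a minimal sketch: it assumes the parted print layout shown above (rows listed in on-disk order, partition number in the first column), and the function name out_of_order_parts is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `parted <dev> print` output on stdin and prints
# every partition number that is lower than a number already seen earlier in
# the listing, i.e. a partition whose number no longer matches its position.
out_of_order_parts() {
    awk '
        # Partition rows begin with optional spaces followed by a number;
        # the header and model/size lines do not match this pattern.
        /^ *[0-9]+ / {
            n = $1 + 0
            if (n < max) print n
            else max = n
        }
    '
}
```

Something like `parted /dev/sda print | out_of_order_parts` would then list the renumbered partitions; for the table above it reports partition 1.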

I'm going to trace what is happening with ceph-disk prepare /dev/sde /dev/sda1 and try to coerce that to use the persistent name.
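
As a starting point for that, the persistent name of an existing journal partition can be looked up by scanning the by-partuuid links for the one that resolves to the device node. This is a rough sketch, not something taken from ceph-disk: the function name partuuid_link and its optional second argument are made up for illustration, and it assumes a readlink that supports -f (GNU coreutils does).

```shell
#!/bin/sh
# Hypothetical helper: print the persistent link that resolves to a partition.
#   $1: partition device node, e.g. /dev/sda1
#   $2: directory holding the links (defaults to /dev/disk/by-partuuid)
partuuid_link() {
    dev=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-partuuid}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            printf '%s\n' "$link"
            return 0
        fi
    done
    # No link in the directory points at the given device.
    return 1
}
```

For example, `partuuid_link /dev/sda1` would print the matching /dev/disk/by-partuuid/ path, if one exists. Whether ceph-disk prepare keeps that path in the journal link when it is passed as the journal argument is untested here.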

Cheers, Dan





> 
> Best of luck.
> 
> Owen
> 
> 
> 
> 
> 
> On 09/29/2014 10:24 AM, Dan Van Der Ster wrote:
>> Hi,
>> 
>>> On 29 Sep 2014, at 10:01, Daniel Swarbrick <daniel.swarbrick at profitbricks.com> wrote:
>>> 
>>> On 26/09/14 17:16, Dan Van Der Ster wrote:
>>>> Hi,
>>>> Apologies for this trivial question, but what is the correct procedure to replace a failed OSD that uses a shared journal device?
>>>> 
>>>> I'm just curious, for such a routine operation, what are most admins doing in this case?
>>>> 
>>> 
>>> I think ceph-osd is what you need.
>>> 
>>> ceph-osd -i <osd id> --mkjournal
>> 
>> 
>> At the moment I am indeed using this command in our Puppet manifests for creating and replacing OSDs. But now I'm trying to use the ceph-disk udev magic, since it seems to be the best (perhaps only?) way to get persistently named OSD and journal devs (on RHEL6).
>> 
>> Cheers, Dan
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users at lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 