Fresh deploy of ceph 0.83 has OSD down

On 07/08/14 11:06, Mark Kirkwood wrote:
> Hi,
>
> I'm doing a fresh install of ceph 0.83 (src build) to an Ubuntu 14.04 VM
> using ceph-deploy 1.59. Everything goes well until the osd creation,
> which fails to start with a journal open error. The steps are shown
> below (ceph is the deploy target host):

> $ tail ceph.osd.0.log
> 2014-08-07 10:47:45.350623 7ffe95e05800  1 journal _open
> /var/lib/ceph/osd/ceph-0/journal fd 20: 2147483648 bytes, block size
> 4096 bytes, directio = 1, aio = 1
> 2014-08-07 10:47:45.351364 7ffe95e05800 -1 journal read_header error
> decoding journal header
> 2014-08-07 10:47:45.351398 7ffe95e05800 -1
> filestore(/var/lib/ceph/osd/ceph-0) mount failed to open journal
> /var/lib/ceph/osd/ceph-0/journal: (22) Invalid argument
>

Some more analysis pointed to something fishy with separate journals.
To get to the root cause I decided to create a ceph setup directly via
a simple script (attached), and so rule out ceph-deploy as a factor.

It quickly emerged that the issue is to do with a recent commit
concerning journals. In a simplified test case, where I try to create a
single osd with a separate device (or device partition) for the journal,
I see hangs or '(22) Invalid argument' errors. Using ceph version
0.83-611-g4d2d4dd, if I revert commit 4eb18dd I can get osds up in a
fresh install [1], and in upgrades I can rescue osds with device
journals that refuse to start *provided* I recreate the journal (yes,
that's a little strange, possibly some more commits to examine... but at
least I can get 'em started)!
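
For anyone wanting to try the same rescue, the journal recreation step
might look roughly like the sketch below. It is not taken from the
attached script: the osd id 0, the OSD_ID and DRY_RUN variables and the
helper names are assumptions for illustration. It relies only on the
ceph-osd --mkjournal option, and by default just prints the command
rather than running it.

```shell
#!/bin/sh
# Sketch: recreate the journal for one osd (assumes the osd daemon is stopped).
# OSD_ID, DRY_RUN, run and recreate_journal are illustrative names only.
OSD_ID="${OSD_ID:-0}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recreate_journal() {
    # --mkjournal creates a new journal for an existing osd data directory;
    # anything still sitting unflushed in the old journal is lost.
    run ceph-osd -i "$1" --mkjournal
}

recreate_journal "$OSD_ID"
```

Run with DRY_RUN=0 to actually execute the command. Bear in mind this
only helped here with commit 4eb18dd reverted.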

regards

Mark

[1] Specifically, a fresh install hangs on a mutex at the osd mkfs step:

futex(0x7fffaa3fcbac, FUTEX_WAIT_PRIVATE, 1, NULL...)

and in existing setups seeing:

filestore(/var/lib/ceph/osd/ceph-0) mount failed to open journal 
/var/lib/ceph/osd/ceph-0/journal: (22) Invalid argument

as indicated above.
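
To check whether an existing osd is hitting the same failure, its log
can be scanned for the two messages quoted above. A minimal sketch,
assuming the log file path is passed as the first argument
(has_journal_error is a made-up helper name):

```shell
#!/bin/sh
# Sketch: detect the journal open failure in an osd log file.
# has_journal_error is an illustrative helper, not part of ceph.
has_journal_error() {
    # Succeeds if either message from the failing mount appears in the file.
    grep -E -q 'journal read_header error|failed to open journal' "$1"
}

LOG="${1:-}"
if [ -n "$LOG" ]; then
    if has_journal_error "$LOG"; then
        echo "journal open failure found in $LOG"
    else
        echo "no journal open failure in $LOG"
    fi
fi
```

A match only says the symptom is present; it does not by itself prove
that commit 4eb18dd is the cause.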


-------------- next part --------------
A non-text attachment was scrubbed...
Name: deploy.sh
Type: application/x-shellscript
Size: 2755 bytes
Desc: not available
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140811/7fcd6c5b/attachment.bin>

