On 09/10/2014 16:29, SCHAER Frederic wrote:
>
> -----Original Message-----
> From: Loic Dachary [mailto:loic@xxxxxxxxxxx]
> Sent: Thursday, 9 October 2014 16:20
> To: SCHAER Frederic; ceph-users@xxxxxxxxxxxxxx
> Subject: Re: ceph-dis prepare : UUID=00000000-0000-0000-0000-000000000000
>
> On 09/10/2014 16:04, SCHAER Frederic wrote:
>> Hi Loic,
>>
>> Back on sdb, as the sde output was from another machine on which I ran partx -u afterwards.
>> To reply to your last question first: I think the SG_IO error comes from the fact that the disks are exported as single-disk RAID0 volumes on a PERC 6/E, which does not support JBOD - this is decommissioned hardware on which I'd like to test and validate that we can use Ceph for our use case...
>>
>> So, back to the UUID.
>> It's funny: I retried and ceph-disk prepare worked this time. I tried on another disk, and it failed.
>> There is a difference in the output from ceph-disk: on the failing disk, I have these extra lines after the disks are prepared:
>>
>> (...)
>> realtime =none extsz=4096 blocks=0, rtextents=0
>> Warning: The kernel is still using the old partition table.
>> The new table will be used at the next reboot.
>> The operation has completed successfully.
>> partx: /dev/sdc: error adding partitions 1-2
>>
>> I didn't have the warning about the old partition table on the disk that worked.
>> So on this new disk, I have:
>>
>> [root@ceph1 ~]# mount /dev/sdc1 /mnt
>> [root@ceph1 ~]# ll /mnt/
>> total 16
>> -rw-r--r-- 1 root root 37 Oct 9 15:58 ceph_fsid
>> -rw-r--r-- 1 root root 37 Oct 9 15:58 fsid
>> lrwxrwxrwx 1 root root 58 Oct 9 15:58 journal -> /dev/disk/by-partuuid/5e50bb8b-0b99-455f-af71-10815a32bfbc
>> -rw-r--r-- 1 root root 37 Oct 9 15:58 journal_uuid
>> -rw-r--r-- 1 root root 21 Oct 9 15:58 magic
>>
>> [root@ceph1 ~]# cat /mnt/journal_uuid
>> 5e50bb8b-0b99-455f-af71-10815a32bfbc
>>
>> [root@ceph1 ~]# sgdisk --info=1 /dev/sdc
>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>> Partition unique GUID: 244973DE-7472-421C-BB25-4B09D3F8D441
>> First sector: 10487808 (at 5.0 GiB)
>> Last sector: 1952448478 (at 931.0 GiB)
>> Partition size: 1941960671 sectors (926.0 GiB)
>> Attribute flags: 0000000000000000
>> Partition name: 'ceph data'
>>
>> [root@ceph1 ~]# sgdisk --info=2 /dev/sdc
>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>> Partition unique GUID: 5E50BB8B-0B99-455F-AF71-10815A32BFBC
>> First sector: 2048 (at 1024.0 KiB)
>> Last sector: 10485760 (at 5.0 GiB)
>> Partition size: 10483713 sectors (5.0 GiB)
>> Attribute flags: 0000000000000000
>> Partition name: 'ceph journal'
>>
>> Puzzling, isn't it?
>
> Yes :-) Just to be 100% sure: when you try to activate this /dev/sdc, does it show an error and complain that the journal uuid is 0000-000* etc.? If so, could you copy your udev debug output?
>
> Cheers
>
> [>- FS : -<]
>
> No, when I manually activate the disk instead of attempting to go the udev way, it seems to work:
>
> [root@ceph1 ~]# ceph-disk activate /dev/sdc1
> got monmap epoch 1
> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 2014-10-09 16:21:43.286288 7f2be6a027c0 -1 journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected 244973de-7472-421c-bb25-4b09d3f8d441, invalid (someone else's?) journal
> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 2014-10-09 16:21:43.301957 7f2be6a027c0 -1 filestore(/var/lib/ceph/tmp/mnt.4lJlzP) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
> 2014-10-09 16:21:43.305941 7f2be6a027c0 -1 created object store /var/lib/ceph/tmp/mnt.4lJlzP journal /var/lib/ceph/tmp/mnt.4lJlzP/journal for osd.47 fsid 70ac4a78-46c0-45e6-8ff9-878b37f50fa1
> 2014-10-09 16:21:43.305992 7f2be6a027c0 -1 auth: error reading file: /var/lib/ceph/tmp/mnt.4lJlzP/keyring: can't open /var/lib/ceph/tmp/mnt.4lJlzP/keyring: (2) No such file or directory
> 2014-10-09 16:21:43.306099 7f2be6a027c0 -1 created new key in keyring /var/lib/ceph/tmp/mnt.4lJlzP/keyring
> added key for osd.47
> === osd.47 ===
> create-or-move updating item name 'osd.47' weight 0.9 at location {host=ceph1,root=default} to crush map
> Starting Ceph osd.47 on ceph1...
> Running as unit run-12392.service.
>
> The osd then appeared in the osd tree...
> I attached the logs to this email (I just added a set -x in the script called by udev, and redirected the output)

The failure

  journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected 244973de-7472-421c-bb25-4b09d3f8d441

and the udev log line

  DEBUG:ceph-disk:Journal /dev/sdc2 has OSD UUID 00000000-0000-0000-0000-000000000000

mean that /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdc2 fails to read the OSD UUID from /dev/sdc2, which in turn means something went wrong when the journal was prepared.
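If you want to double-check by hand, a quick sketch along these lines (reusing the commands already quoted above; the -i 0 id is just a placeholder) should show the mismatch directly:

  # read the OSD UUID stored in the header of the journal partition
  /usr/bin/ceph-osd -i 0 --get-journal-uuid --osd-journal /dev/sdc2
  # the value it is expected to match: the unique GUID of the 'ceph data' partition
  sgdisk --info=1 /dev/sdc | grep 'unique GUID'

On the broken disk the first command should print all zeros, as in your udev log, while the second prints 244973DE-7472-421C-BB25-4B09D3F8D441.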
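For reference, an untested sketch of that reproduction (assuming /dev/sdc is the disk in question, flags from memory so please double-check them):

  # wipe the partition table and the ceph partitions
  ceph-disk zap /dev/sdc
  # ask the kernel to refresh its view of the partition table (or: partprobe /dev/sdc)
  partx -u /dev/sdc
  # any stale symlinks left here would confirm that zap did not clean up properly
  ls -l /dev/disk/by-partuuid/
  # then prepare again, with the verbose output mentioned above
  ceph-disk -v prepare /dev/sdc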
Cheers

--
Loïc Dachary, Artisan Logiciel Libre