Hi,

I was wondering: why did you use CephFS instead of RBD? RBD is much more
reliable and well integrated with QEMU/KVM. Or perhaps you just wanted to try
CephFS?

––––
Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien.han@xxxxxxxxxxxx
Address: 10, rue de la Victoire - 75009 Paris
Web: www.enovance.com - Twitter: @enovance

On October 11, 2013 at 4:47:58 AM, Frerot, Jean-Sébastien (jsfrerot@xxxxxxxxxxxxxxxx) wrote:
>
> Hi,
> I followed this documentation and didn't specify any CRUSH settings:
>
> http://ceph.com/docs/next/rbd/rbd-openstack/
>
> --
> Jean-Sébastien Frerot
> jsfrerot@xxxxxxxxxxxxxxxx
>
>
> 2013/10/10 Gregory Farnum
>
>> Okay. As a quick guess, you probably used a CRUSH placement option with
>> your new pools that wasn't supported by the old kernel, although it
>> might have been something else.
>>
>> I suspect that you'll find FUSE works better for you anyway as long as
>> you can use it — faster updates from us to you. ;)
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>> On Thu, Oct 10, 2013 at 10:53 AM, Frerot, Jean-Sébastien wrote:
>> > Hi,
>> > Thanks for your reply :)
>> >
>> > Kernel: Linux compute01 3.8.0-31-generic #46-Ubuntu SMP Tue Sep 10 20:03:44 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>> >
>> > So yes, I'm using CephFS, and I was also using RBD at the same time with
>> > different pools. My CephFS was set up 3 months ago, and I upgraded it a
>> > couple of days ago. I moved the VM images from RBD to CephFS by copying
>> > each file from RBD to the local FS and then to CephFS.
>> >
>> > I created the pools like this:
>> > ceph osd pool create volumes 128
>> > ceph osd pool create images 128
>> > ceph osd pool create live_migration 128
>> >
>> > Yes, I had checked dmesg but didn't find anything relevant.
>> >
>> > However, as a last resort I decided to mount my FS using FUSE, and it
>> > works like a charm. So for now I'm sticking with FUSE :)
>> >
>> > Let me know if you want me to do some explicit testing. It may take some
>> > time since I'm actively using Ceph, but I can arrange some maintenance
>> > windows.
>> >
>> > Regards,
>> >
>> > --
>> > Jean-Sébastien Frerot
>> > jsfrerot@xxxxxxxxxxxxxxxx
>> >
>> >
>> > 2013/10/10 Gregory Farnum
>> >>
>> >> (Sorry for the delayed response, this was in my spam folder!)
>> >>
>> >> Has this issue persisted? Are you using the stock 13.04 kernel?
>> >>
>> >> Can you describe your setup a little more clearly? It sounds like
>> >> maybe you're using CephFS now and were using RBD before; is that
>> >> right? What data did you move, when, and how did you set up your
>> >> CephFS to use the pools?
>> >>
>> >> The socket errors are often a slightly spammy notification that a
>> >> socket isn't in use but has shut down; here they look like an
>> >> indicator that something has actually gone wrong — perhaps you've
>> >> inadvertently activated features incompatible with your kernel client,
>> >> but let's see more of what's going on before we jump to that
>> >> conclusion. Have you checked dmesg for anything else at those points?
>> >> -Greg
>> >> Software Engineer #42 @ http://inktank.com | http://ceph.com
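For anyone who wants to chase down Greg's hypothesis on their own cluster, the checks might look roughly like the following. This is only a sketch: the commands run on a node with admin access, and which pool flag or CRUSH tunable (if any) is actually unsupported by a 3.8 kernel client is an assumption to verify against the kernel's feature set, not a confirmed diagnosis.

# Pool definitions, including any flags set on the newer pools
# (a flag such as "hashpspool" is the kind of thing an older kernel may not understand)
ceph osd dump | grep '^pool'

# Decompile the current CRUSH map and look for non-default "tunable" lines at the top
ceph osd getcrushmap -o /tmp/crushmap
crushtool -d /tmp/crushmap -o /tmp/crushmap.txt
head -20 /tmp/crushmap.txt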
>> >> On Sat, Oct 5, 2013 at 6:42 PM, Frerot, Jean-Sébastien wrote:
>> >> > Hi,
>> >> > I have a Ceph cluster running on 3 physical servers.
>> >> >
>> >> > Here is how my setup is configured:
>> >> > server1: mon, osd, mds
>> >> > server2: mon, osd, mds
>> >> > server3: mon
>> >> > OS: Ubuntu 13.04
>> >> > Ceph version: 0.67.4-1raring (recently upgraded to see if my problem
>> >> > still persisted with the new version)
>> >> >
>> >> > So I was running CUTTLEFISH until yesterday. I was using Ceph with
>> >> > OpenStack (using RBD), but I simplified my setup and removed OpenStack
>> >> > to simply use KVM with virt-manager.
>> >> >
>> >> > So I created a new pool to be able to do live migration of KVM instances:
>> >> > # ceph osd lspools
>> >> > 0 data,1 metadata,2 rbd,3 volumes,4 images,6 live_migration,
>> >> >
>> >> > I had been running VMs for some days without problems, but then I
>> >> > noticed that I couldn't use the full disk size of my first VM (web01,
>> >> > which was originally 160G but now shows as only 119G stored in Ceph).
>> >> > I also have a Windows instance running on a 300G raw file located in
>> >> > Ceph. So, trying to fix the issue, I decided to make a local backup of
>> >> > my file in case something went wrong, and guess what: I wasn't able to
>> >> > copy the file from Ceph to my local drive. The moment I ran
>> >> > "cp live_migration/web01 /mnt/", the OS hung, and syslog showed this at
>> >> > more than 30 lines/s:
>> >> >
>> >> > Oct 5 15:25:45 server2 kernel: [ 8773.432358] libceph: osd1 192.168.0.131:6803 socket error on read
>> >> >
>> >> > I couldn't kill the cp or reboot the server normally, so I had to reset it.
>> >> >
>> >> > I tried to copy my other file, "win2012", also stored in the Ceph
>> >> > cluster, and got the same issue; now I can't read anything from it or
>> >> > start my VM again.
>> >> >
>> >> > [root@server1 ~]# ceph status
>> >> >   cluster 50dc0404-c081-4c43-ac3f-872ba5494bd7
>> >> >    health HEALTH_OK
>> >> >    monmap e4: 3 mons at {server1=192.168.0.130:6789/0,server2=192.168.0.131:6789/0,server3=192.168.0.132:6789/0}, election epoch 120, quorum 0,1,2 server1,server2,server3
>> >> >    osdmap e275: 2 osds: 2 up, 2 in
>> >> >    pgmap v1508209: 576 pgs: 576 active+clean; 108 GB data, 214 GB used, 785 GB / 999 GB avail
>> >> >    mdsmap e181: 1/1/1 up {0=server2=up:active}, 1 up:standby
>> >> >
>> >> > I mount the FS with fstab like this:
>> >> > 192.168.0.131:6789,192.168.0.130:6789:/live_migration /var/lib/instances ceph name=live_migration,secret=mysecret==,noatime 0 2
>> >> >
>> >> > I get this in ceph-osd.0.log, just as spammy as the "socket error on
>> >> > read" messages in syslog:
>> >> > 2013-10-05 23:07:23.586807 7f24731cc700 0 -- 192.168.0.130:6801/19182 >> 192.168.0.130:0/4212596483 pipe(0x128d8500 sd=115 :6801 s=0 pgs=0 cs=0 l=0 c=0x14ac09a0).accept peer addr is really 192.168.0.130:0/4212596483 (socket is 192.168.0.130:35078/0)
>> >> >
>> >> > Other info:
>> >> > df -h
>> >> > /dev/mapper/server1--vg-ceph                            500G  108G  393G  22%  /opt/data/ceph
>> >> > 192.168.0.131:6789,192.168.0.130:6789:/live_migration  1000G  215G  786G  22%  /var/lib/instances
>> >> > ...
>> >> > mount
>> >> > /dev/mapper/server1--vg-ceph on /opt/data/ceph type xfs (rw,noatime)
>> >> > 192.168.0.131:6789,192.168.0.130:6789:/live_migration on /var/lib/instances type ceph (name=live_migration,key=client.live_migration)
>> >> > ...
>> >> >
>> >> > How can I recover from this?
>> >> >
>> >> > Thank you,
>> >> > --
>> >> > Jean-Sébastien Frerot
>> >> > jsfrerot@xxxxxxxxxxxxxxxx

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
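Since the workaround that ended up sticking in this thread is the userspace client, the equivalent FUSE mount of the same directory would look roughly like this. It is a sketch based on the monitor addresses, client name, and mount point quoted above, and it assumes the key for client.live_migration is available in a keyring under /etc/ceph (otherwise point -k at the keyring file explicitly).

# Mount the /live_migration directory of CephFS with ceph-fuse instead of the kernel client
ceph-fuse -m 192.168.0.130:6789,192.168.0.131:6789 \
          --id live_migration \
          -r /live_migration \
          /var/lib/instances

# Unmount when done
fusermount -u /var/lib/instances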
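For the RBD route that Sébastien suggests at the top of the thread, moving a raw image off CephFS and serving it to KVM directly from RADOS could look roughly like this. The pool and image names are simply taken from this thread, and the commands assume a qemu-img build with rbd support.

# Import the raw VM image into an RBD image in the "volumes" pool
# (qemu-img creates the destination RBD image as part of the conversion)
qemu-img convert -f raw -O raw /var/lib/instances/web01 rbd:volumes/web01

# Confirm the image exists and has the expected size
rbd info volumes/web01
rbd ls volumes

The guest can then reference rbd:volumes/web01 as a network disk in libvirt/virt-manager rather than as a file on a mounted filesystem, which also takes the CephFS kernel client out of the picture entirely.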