On 06/02/17 12:06, koukou73gr wrote:
> Thanks for the reply.
>
> Easy?
> Sure, it happens reliably every time I boot the guest with
> exclusive-lock on :)

If it's that easy, also try with only exclusive-lock, and not object-map
nor fast-diff. And also with one or the other of those.

> I'll need some walkthrough on the gcore part though!

gcore is pretty easy... just do something like:

    gcore -o "$outfile" "$pid"

And then upload it to the devs in a *sorta private way:

    ceph-post-file -d "gcore dump of hung qemu process with exclusive-lock" "$outfile"

* sorta private warning from ceph-post-file:
> WARNING:
>   Basic measures are taken to make posted data be visible only to
>   developers with access to ceph.com infrastructure. However, users
>   should think twice and/or take appropriate precautions before
>   posting potentially sensitive data (for example, logs or data
>   directories that contain Ceph secrets).

> -K.
>
>
> On 2017-06-02 12:59, Peter Maloney wrote:
>> On 06/01/17 17:12, koukou73gr wrote:
>>> Hello list,
>>>
>>> Today I had to create a new image for a VM. This was the first time
>>> since our cluster was updated from Hammer to Jewel. Until now I had
>>> just been copying an existing golden image and resizing it as
>>> appropriate, but this time I used rbd create.
>>>
>>> So I "rbd create"d a 2T image and attached it to an existing VM guest
>>> with librbd using:
>>>
>>>     <disk type='network' device='disk'>
>>>       <driver name='qemu'/>
>>>       <auth username='lalala'>
>>>         <secret type='ceph' uuid='uiduiduid'/>
>>>       </auth>
>>>       <source protocol='rbd' name='libvirt-pool/srv-10-206-123-87.mails'/>
>>>       <target dev='sdc' bus='scsi'/>
>>>       <address type='drive' controller='0' bus='0' target='1' unit='0'/>
>>>     </disk>
>>>
>>> I booted the guest and tried to partition the new drive from inside
>>> it. From that point, parted (and anything else that tried to access
>>> the new disk, for that matter) would freeze. After 2 minutes the
>>> kernel would start complaining:
>>>
>>> [  360.212391] INFO: task parted:1836 blocked for more than 120 seconds.
>>> [  360.216001]       Not tainted 4.4.0-78-generic #99-Ubuntu
>>> [  360.218663] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>>> disables this message.
>>
>> Is it easy for you to reproduce it? I had the same problem, and the
>> same solution. But it isn't easy to reproduce... Jason Dillaman asked
>> me for a gcore dump of a hung process but I wasn't able to get one.
>> Can you do that, and when you reply, CC Jason Dillaman
>> <jdillama@xxxxxxxxxx>?

-- 
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@xxxxxxxxxxxxxxxxxxxx
Internet: http://www.brockmann-consult.de
--------------------------------------------

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
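
A minimal sketch of the feature test suggested above, using the stock
rbd CLI. The image name is copied from the <source> element quoted in
this thread; substitute your own. Note that fast-diff depends on
object-map, so the two are disabled together and re-enabled in order:

    # show which features are currently enabled on the image
    rbd info libvirt-pool/srv-10-206-123-87.mails

    # keep exclusive-lock on, turn off object-map and fast-diff
    rbd feature disable libvirt-pool/srv-10-206-123-87.mails fast-diff object-map

    # to restore afterwards: object-map first, then fast-diff
    rbd feature enable libvirt-pool/srv-10-206-123-87.mails object-map
    rbd feature enable libvirt-pool/srv-10-206-123-87.mails fast-diff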
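
And a sketch of the gcore/upload steps end to end, assuming the guest's
libvirt domain name is GUESTNAME (hypothetical) and that libvirt has
written its usual per-domain pidfile:

    # find the qemu process backing the hung guest
    pid=$(cat /var/run/libvirt/qemu/GUESTNAME.pid)
    # alternatively, match on the process command line
    pid=$(pgrep -f "qemu.*GUESTNAME")

    # dump a core of the still-running process while the guest is hung
    outfile=/tmp/qemu-hung
    gcore -o "$outfile" "$pid"

    # gcore appends the pid to the prefix, so upload ${outfile}.${pid}
    ceph-post-file -d "gcore dump of hung qemu process with exclusive-lock" "${outfile}.${pid}"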