On Thursday, 14 October 2021 09:28:12 CEST Michal Prívozník wrote: > On 10/13/21 2:54 PM, Vojtech Juranek wrote: > > Hi, > > I'm trying to find the root cause for BZ #1898049 [1]. When setting up > > Windows HA cluster on Windows Server VMs run on top of oVirt, Windows > > cluster validator runs couple of tests and fails during test "Validate > > SCSI-3 Persistent Reservation" and one of the VMs of> > > the cluster is paused with IO error. Disk definition is as follows: > > <disk type='block' device='lun' sgio='unfiltered' snapshot='no'> > > > > <driver name='qemu' type='raw' cache='none' error_policy='stop' > > io='native'/> <source > > dev='/dev/mapper/3600a09803830447a4f244c4657616f6f' index='1'>> > > <seclabel model='dac' relabel='no'/> > > <reservations managed='yes'> > > > > <source type='unix' > > path='/var/lib/libvirt/qemu/domain-1-Windows-2016-2/pr-helper0. > > sock' mode='client'/>> > > </reservations> > > > > </source> > > <backingStore/> > > <target dev='sdb' bus='scsi'/> > > <shareable/> > > <alias name='ua-26b4975e-e1d4-4e27-b2c6-2ea0894a571b'/> > > <address type='drive' controller='0' bus='0' target='0' unit='1'/> > > > > </disk> > > > > and libvirt error I get is bellow [2]. > > > > When I try to create reservation from Windows VM manually, I get following > > error (but not sure I do it whole process correctly): > > > > .\sg_persist --out --register --param-sark=123abc e: > > QEMU QEMU HARDDISK 2.5+ > > Peripheral device type: disk > > > > PR out (Register): command not supported > > sg_persist failed: Illegal request, Invalid opcode > > > > > > Do you have any ideas what could be wrong or how to determine > > the root cause of this this issue? > > > > Thanks in advance. > > Vojta > > > > > > [1] https://bugzilla.redhat.com/1898049 > > [2] libvirt debug log: > > > > 2021-10-12 11:43:25.148+0000: 2006427: debug : qemuMonitorEmitIOError:1243 > > : mon=0x7fb02006a020 2021-10-12 11:43:25.148+0000: 2006427: info : > > virObjectRef:402 : OBJECT_REF: obj=0x7fb02006a020 2021-10-12 > > 11:43:25.148+0000: 2006427: info : virObjectRef:402 : OBJECT_REF: > > obj=0x7fafd0130020 2021-10-12 11:43:25.148+0000: 2000208: info : > > virObjectRef:402 : OBJECT_REF: obj=0x7fafd010d340 2021-10-12 > > 11:43:25.148+0000: 2006427: info : virObjectNew:258 : OBJECT_NEW: > > obj=0x7fb020082590 classname=virDomainEventIOError 2021-10-12 > > 11:43:25.148+0000: 2000208: info : vir_object_finalize:321 : > > OBJECT_DISPOSE: obj=0x7fb020082500 2021-10-12 11:43:25.148+0000: 2006427: > > info : virObjectNew:258 : OBJECT_NEW: obj=0x7fb020082620 > > classname=virDomainEventIOError 2021-10-12 11:43:25.148+0000: 2000208: > > info : virObjectUnref:380 : OBJECT_UNREF: obj=0x7fb020082500 2021-10-12 > > 11:43:25.148+0000: 2006427: debug : qemuProcessHandleIOError:907 : > > Transitioned guest Windows-2016-2 to paused state due to IO error > > 2021-10-12 11:43:25.148+0000: 2000208: info : virObjectUnref:380 : > > OBJECT_UNREF: obj=0x7fafd010d340 2021-10-12 11:43:25.148+0000: 2006427: > > info : virObjectNew:258 : OBJECT_NEW: obj=0x7fafbc1fb8c0 > > classname=virDomainEventLifecycle 2021-10-12 11:43:25.148+0000: 2006427: > > debug : virDomainLockProcessPause:204 : plugin=0x7fafd01272a0 > > dom=0x7fb01400f5e0 state=0x7fb02401d768 2021-10-12 11:43:25.148+0000: > > 2006427: debug : virDomainLockManagerNew:134 : plugin=0x7fafd01272a0 > > dom=0x7fb01400f5e0 withResources=1 2021-10-12 11:43:25.148+0000: 2006427: > > debug : virLockManagerPluginGetDriver:276 : plugin=0x7fafd01272a0 > > 2021-10-12 11:43:25.148+0000: 2006427: debug : virLockManagerNew:300 : > > driver=0x7fafd444a000 type=0 nparams=5 params=0x7fafd77de640 flags=0x0 > > 2021-10-12 11:43:25.148+0000: 2006427: debug : virLockManagerLogParams:97 > > : key=uuid type=uuid value=70eee88c-ba2c-4c6c-bd51-c2b663db27f8 > > 2021-10-12 11:43:25.148+0000: 2006427: debug : virLockManagerLogParams:90 > > : key=name type=string value=Windows-2016-2 2021-10-12 > > 11:43:25.148+0000: 2006427: debug : virLockManagerLogParams:78 : key=id > > type=uint value=1 2021-10-12 11:43:25.148+0000: 2006427: debug : > > virLockManagerLogParams:78 : key=pid type=uint value=2006418 2021-10-12 > > 11:43:25.148+0000: 2006427: debug : virLockManagerLogParams:93 : > > key=uri type=cstring value=(null) 2021-10-12 11:43:25.148+0000: 2006427: > > debug : virDomainLockManagerNew:146 : Adding leases 2021-10-12 > > 11:43:25.148+0000: 2006427: debug : virDomainLockManagerNew:151 : Adding > > disks 2021-10-12 11:43:25.149+0000: 2006427: debug : > > virDomainLockManagerAddImage:90 : Add disk > > /rhev/data-center/mnt/blockSD/7c4f09b6-9e87-436f-bda9-22d1f0b50955/images > > /f5d6e074-dfe9-462d-8cfd-3e14b0eb5aea/766e36b2-84a6-43e7-a48b-a5f47e669860 > > 2021-10-12 11:43:25.149+0000: 2006427: debug : > > virLockManagerAddResource:326 : lock=0x7fafbc19e250 type=0 > > name=/rhev/data-center/mnt/blockSD/7c4f09b6-9e87-436f-bda9-22d1f0b50955/i > > mages/f5d6e074-dfe9-462d-8cfd-3e14b0eb5aea/766e36b2-84a6-43e7-a48b-a5f47e6 > > 69860 nparams=0 params=(nil) flags=0x0 2021-10-12 11:43:25.149+0000: > > 2006427: debug : virDomainLockManagerAddImage:90 : Add disk > > /dev/mapper/3600a09803830447a4f244c4657616f6f 2021-10-12 > > 11:43:25.149+0000: 2006427: debug : virLockManagerAddResource:326 : > > lock=0x7fafbc19e250 type=0 > > name=/dev/mapper/3600a09803830447a4f244c4657616f6f nparams=0 params=(nil) > > flags=0x2 2021-10-12 11:43:25.149+0000: 2006427: debug : > > virLockManagerRelease:359 : lock=0x7fafbc19e250 state=0x7fb02401d768 > > flags=0x0 2021-10-12 11:43:25.149+0000: 2006427: debug : > > virLockManagerFree:381 : lock=0x7fafbc19e250 2021-10-12 > > 11:43:25.149+0000: 2006427: debug : qemuProcessHandleIOError:920 : > > Preserving lock state '<null>' 2021-10-12 11:43:25.150+0000: 2006427: > > info : virObjectUnref:380 : OBJECT_UNREF: obj=0x7fafd0130020 2021-10-12 > > 11:43:25.150+0000: 2006427: info : virObjectUnref:380 : OBJECT_UNREF: > > obj=0x7fb02006a020 2021-10-12 11:43:25.150+0000: 2000208: info : > > virObjectRef:402 : OBJECT_REF: obj=0x7fafd010d340 2021-10-12 > > 11:43:25.150+0000: 2000208: info : vir_object_finalize:321 : > > OBJECT_DISPOSE: obj=0x7fb020082590 2021-10-12 11:43:25.150+0000: 2006427: > > info : virObjectRef:402 : OBJECT_REF: obj=0x7fb02006a020 2021-10-12 > > 11:43:25.150+0000: 2000208: info : virObjectUnref:380 : OBJECT_UNREF: > > obj=0x7fb020082590 2021-10-12 11:43:25.150+0000: 2006427: info : > > virObjectUnref:380 : OBJECT_UNREF: obj=0x7fb02006a020 2021-10-12 > > 11:43:25.150+0000: 2000208: info : virObjectNew:258 : OBJECT_NEW: > > obj=0x564a89a4cc60 classname=virDomain 2021-10-12 11:43:25.150+0000: > > 2006427: info : virObjectUnref:380 : OBJECT_UNREF: obj=0x7fb02006a020 > > 2021-10-12 11:43:25.150+0000: 2000208: info : virObjectRef:402 : > > OBJECT_REF: obj=0x7fafd0018ca0 2021-10-12 11:43:25.150+0000: 2000208: > > info : virObjectRef:402 : OBJECT_REF: obj=0x564a89978df0 2021-10-12 > > 11:43:25.150+0000: 2000208: debug : virAccessManagerCheckDomain:238 : > > manager=0x564a89978df0(name=stack) driver=QEMU domain=0x7ffd2c677010 > > perm=0 2021-10-12 11:43:25.150+0000: 2000208: debug : > > virAccessManagerCheckDomain:238 : manager=0x564a89978e50(name=none) > > driver=QEMU domain=0x7ffd2c677010 perm=0 2021-10-12 11:43:25.150+0000: > > 2000208: info : virObjectUnref:380 : OBJECT_UNREF: obj=0x564a89978df0 > > 2021-10-12 11:43:25.150+0000: 2000208: debug : > > remoteRelayDomainEventIOErrorReason:529 : Relaying domain io error > > Windows-2016-2 1 /dev/mapper/3600a09803830447a4f244c4657616f6f > > ua-26b4975e-e1d4-4e27-b2c6-2ea0894a571b 1 , callback 3 > Unfortunately, you cut off the log a bit early/late; there should be a > message from qemu's monitor mentioning BLOCK_IO_ERROR; Can you check > your logs please? yes, it's few lines earlier: 2021-10-12 11:43:22.297+0000: 2000213: info : virObjectUnref:380 : OBJECT_UNREF: obj=0x564a89a1d010 2021-10-12 11:43:22.297+0000: 2000213: info : virObjectUnref:380 : OBJECT_UNREF: obj=0x564a89a46360 2021-10-12 11:43:22.297+0000: 2000208: debug : virNetMessageFree:85 : msg=0x564a89a3d510 nfds=0 cb=(nil) 2021-10-12 11:43:22.297+0000: 2000208: debug : virNetServerClientCalculateHandleMode:154 : tls=(nil) hs=-1, rx=0x564a89a48620 tx=(nil) 2021-10-12 11:43:22.297+0000: 2000208: debug : virNetServerClientCalculateHandleMode:185 : mode=01 2021-10-12 11:43:25.148+0000: 2006427: info : virObjectRef:402 : OBJECT_REF: obj=0x7fb02006a020 2021-10-12 11:43:25.148+0000: 2006427: debug : qemuMonitorEmitEvent:1166 : mon=0x7fb02006a020 event=BLOCK_IO_ERROR 2021-10-12 11:43:25.148+0000: 2006427: info : virObjectRef:402 : OBJECT_REF: obj=0x7fb02006a020 2021-10-12 11:43:25.148+0000: 2006427: debug : qemuProcessHandleEvent:574 : vm=0x7fb01400f5e0 2021-10-12 11:43:25.148+0000: 2006427: info : virObjectNew:258 : OBJECT_NEW: obj=0x7fb020082500 classname=virDomainQemuMonitorEvent 2021-10-12 11:43:25.148+0000: 2006427: info : virObjectUnref:380 : OBJECT_UNREF: obj=0x7fb02006a020 2021-10-12 11:43:25.148+0000: 2006427: debug : qemuMonitorEmitIOError:1243 : mon=0x7fb02006a020 However, it seem the issue was caused by using error_policy='stop' in <driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/> I need to double check, but switching to error_policy='report' seems to fix the issue and HA validator succeeds (probably one of the test is testing concurrent reservations and validating that one reservation fails?). Thanks Vojta
Attachment:
signature.asc
Description: This is a digitally signed message part.