Thanks
Bjoern

From: Peter Krempa <pkrempa@xxxxxxxxxx>
Date: Tuesday, April 19, 2022 at 6:20 AM
To: Bjoern Teipel <bjoern.teipel@xxxxxxxxxxxxx>
Cc: libvirt-users@xxxxxxxxxx <libvirt-users@xxxxxxxxxx>
Subject: Re: Virtio-scsi and block mirroring

CAUTION: This message originated externally, please use caution when clicking on links or opening attachments!

On Thu, Apr 14, 2022 at 16:36:38 +0000, Bjoern Teipel wrote:
> Hello everyone,

Hi,

>
> I'm looking at an issue where I see guests freezing (Dl process state) during a block disk mirror from one storage to another storage (NFS), where the network stack of the guest can freeze for up to 10 seconds.
> Looking at the storage and I/O, I noticed good throughput and low latency (<3 ms), and I am having trouble tracking down the source of the issue, as neither storage nor networking shows issues. Interestingly, when I do the same test with virtio-blk I do not really see the process freezes at the frequency or duration compared to virtio-scsi, which seems to indicate a client-side rather than a storage-side problem.

Hmm, this is really weird if the difference is in the guest-facing
device frontend.

Since libvirt is merely setting up the block job for the copy and the
copy itself is handled by qemu, I suggest you contact the
qemu-block@xxxxxxxxxx mailing list.

Unfortunately you didn't provide any information on the disk
configuration (the VM XML) or how you start the blockjob, which I could
translate for you into qemu specifics. If you provide such information I
can do that to ensure that the qemu folks have all the relevant
information.

Thanks Peter, that's what I figured: the actual copy is done by the qemu process.
The copy job is set up by OpenStack volume migration and translates into:

    <mirror type='file' file='/var/lib/nova/mnt/xxx' format='raw' job='copy'>
      <format type='raw'/>
      <source file='/var/lib/nova/mnt/yyy' index='4'/>
      <backingStore/>
    </mirror>

From what I have observed, the issue is more noticeable when I see more fdatasync calls during the copy, but I haven't been able to correlate that with the issue 100% yet.

Thanks
Bjoern
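
For anyone who wants to pull the block job details out of the domain XML before forwarding them to the qemu list, here is a minimal sketch using only the Python standard library. It parses the <mirror> element quoted in this message and reports the job type, format, and the two file paths. The helper name summarize_mirror is hypothetical (it is not part of libvirt or OpenStack), and the paths are the redacted placeholders from the mail, so treat this as an illustration rather than a tested tool.

```python
import xml.etree.ElementTree as ET

# The <mirror> element as quoted in this message. The paths are the
# redacted placeholders from the mail, not real locations.
MIRROR_XML = """
<mirror type='file' file='/var/lib/nova/mnt/xxx' format='raw' job='copy'>
  <format type='raw'/>
  <source file='/var/lib/nova/mnt/yyy' index='4'/>
  <backingStore/>
</mirror>
"""


def summarize_mirror(xml_text):
    """Return the block job details found in a <mirror> element.

    Hypothetical helper for illustration only; it reads attributes
    that appear in the XML above and nothing else.
    """
    mirror = ET.fromstring(xml_text)
    fmt = mirror.find("format")
    src = mirror.find("source")
    return {
        "job": mirror.get("job"),
        "type": mirror.get("type"),
        # Prefer the <format> child, fall back to the attribute on <mirror>.
        "format": fmt.get("type") if fmt is not None else mirror.get("format"),
        "mirror_file": mirror.get("file"),
        "source_file": src.get("file") if src is not None else None,
    }


if __name__ == "__main__":
    for key, value in summarize_mirror(MIRROR_XML).items():
        print(f"{key}: {value}")
```

Running it against the full <disk> element from the domain XML would need one extra step to locate the <mirror> child first; that part is left out here because the disk definition was not included in the thread.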