On 5/25/23 2:57 AM, Michael S. Tsirkin wrote: > On Wed, May 24, 2023 at 06:34:05PM -0500, Mike Christie wrote: >> The linux block layer requires bios/requests to have lengths with a 512 >> byte alignment. Some drivers/layers like dm-crypt and the directi IO code >> will test for it and just fail. Other drivers like SCSI just assume the >> requirement is met and will end up in infinte retry loops. The problem >> for drivers like SCSI is that it uses functions like blk_rq_cur_sectors >> and blk_rq_sectors which divide the request's length by 512. If there's >> lefovers then it just gets dropped. But other code in the block/scsi >> layer may use blk_rq_bytes/blk_rq_cur_bytes and end up thinking there is >> still data left and try to retry the cmd. We can then end up getting >> stuck in retry loops where part of the block/scsi thinks there is data >> left, but other parts think we want to do IOs of zero length. >> >> Linux will always check for alignment, but windows will not. When >> vhost-scsi then translates the iovec it gets from a windows guest to a >> scatterlist, we can end up with sg items where the sg->length is not >> divisible by 512 due to the misaligned offset: >> >> sg[0].offset = 255; >> sg[0].length = 3841; >> sg... >> sg[N].offset = 0; >> sg[N].length = 255; >> >> When the lio backends then convert the SG to bios or other iovecs, we >> end up sending them with the same misaligned values and can hit the >> issues above. >> >> This just has us drop down to allocating a temp page and copying the data >> when this happens. Because this adds a check that needs to loop over the >> iovec in the main IO path, this patch adds an attribute that can be turned >> on for just windows. >> >> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx> > > I am assuming this triggers rarely, yes? > > If so I would like to find a way to avoid setting an attribute. > > I am guessing the main cost is an extra scan of iov. The scan and a memory allocation to dup the iter. However, I see iov_iter_revert and I think that might be what I needed before to avoid the mem alloc so will try it out. > vhost_scsi_iov_to_sgl already does a scan, how about changing > it to fail if misaligned? > We can then detect the relevant error code and try with a copy. >