Hi Connie, Farhan,
On 02/04/2019 12:06 PM, Cornelia Huck wrote:
From: Farhan Ali <alifm@xxxxxxxxxxxxx>
When trying to calculate the length of a ccw chain, we assume
there are ccws after a TIC. This can lead to overcounting and
copying garbage data from guest memory.
Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
Message-Id: <d63748c1f1b03147bcbf401596638627a5e35ef7.1548082107.git.alifm@xxxxxxxxxxxxx>
Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx>
Signed-off-by: Cornelia Huck <cohuck@xxxxxxxxxx>
---
drivers/s390/cio/vfio_ccw_cp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 70a006ba4d05..ba08fe137c2e 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -392,7 +392,7 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
return -EOPNOTSUPP;
}
- if ((!ccw_is_chain(ccw)) && (!ccw_is_tic(ccw)))
+ if (!ccw_is_chain(ccw))
break;
ccw++;
Sigh. Let me state up front that I'm running vanilla 5.0-rc7 host
kernel with this patch applied (or not), and QEMU from a week or two ago.
I stated in another thread that this patch makes things better when
running fio and other random exercisers within my guest. But I was
always booting off a virtio-blk disk when I did that, and using some
additional disks connected by vfio-ccw. Today I discovered that this
patch prevents guest boot from vfio-ccw (which was working previously):
/usr/local/bin/qemu-system-s390x -machine s390-ccw-virtio,accel=kvm -smp
4 -m 1024 -nographic -s -net none -device
vfio-ccw,sysfsdev=/sys/class/mdev_bus/0.0.0cbb/9503d38c-b83a-45b1-a2f0-393739817786,devno=fe.0.0e1e,force-orb-pfch=true,bootindex=0
-bios /usr/src/qemu/build/pc-bios/s390-ccw/s390-ccw.img
LOADPARM=[ ]
vfio-ccw device I/O error - Interrupt Response Block Data:
Function Ctrl : [Start]
Activity Ctrl :
Status Ctrl : [Alert] [Primary] [Secondary] [Status-Pending]
Device Status : [Channel-End] [Device-End]
Channel Status : [Incorrect-Length]
cpa=: 0x0000000000000018
prev_ccw=: 0x3100003840000008
this_ccw=: 0x0800001000000000
Eckd Dasd Sense Data (fmt 32-bytes):
Sense Condition Flags :
Residual Count =: 0x0000000000000000
Phys Drive ID =: 0x0000000000000000
low cyl address =: 0x0000000000000000
head addr & hi cyl =: 0x0000000000000000
format/message =: 0x0000000000000000
fmt-dependent[0-7] =: 0x0000000000000000
fmt-dependent[8-15]=: 0x0000000005160f00
prog action code =: 0x0000000000000000
Configuration info =: 0x00000000000040e0
mcode / hi-cyl =: 0x0000000000000000
cyl & head addr [0]=: 0x0000000000000000
cyl & head addr [1]=: 0x0000000000000000
cyl & head addr [2]=: 0x0000000000000000
...snip...
Reverting this patch allows the boot to proceed normally. So, ugh.
I'm not going to claim to know exactly how this is failing, because it's
too late for coffee and reading IPL records is difficult enough. But in
the working case (without this patch), cp_init() calls
ccwchain_calc_length() and gets a count of four CCWs (SEEK + SIDE + TIC
+ READ), which is broken up into two chains. One for the SEEK/SIDE/TIC,
and another for the READ. In the failing case (with this patch),
cp_init() only counts three CCWs, we never read the IPL2 address (the
second chain in the working case), and our boot fails.
Unsure what the next best move is here. It is possible that the
(ill-used) check that Farhan noticed and removed was actually added to
"make boot work" when handling recursive TIC CCWs such as this. But I
still agree that this patch is correct, even though it exposes more
problems with our TIC handling throughout this code at large. Any thoughts?
- Eric