Re: [PULL 1/1] vfio-ccw: Don't assume there are more ccws after a TIC

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 02/20/2019 07:44 AM, Cornelia Huck wrote:
On Wed, 20 Feb 2019 06:29:38 -0500
Eric Farman <farman@xxxxxxxxxxxxx> wrote:

On 02/20/2019 04:48 AM, Cornelia Huck wrote:
On Tue, 19 Feb 2019 21:49:07 -0500
Eric Farman <farman@xxxxxxxxxxxxx> wrote:
Hi Connie, Farhan,

On 02/04/2019 12:06 PM, Cornelia Huck wrote:
From: Farhan Ali <alifm@xxxxxxxxxxxxx>

When trying to calculate the length of a ccw chain, we assume
there are ccws after a TIC. This can lead to overcounting and
copying garbage data from guest memory.

Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
Message-Id: <d63748c1f1b03147bcbf401596638627a5e35ef7.1548082107.git.alifm@xxxxxxxxxxxxx>
Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx>
Signed-off-by: Cornelia Huck <cohuck@xxxxxxxxxx>
---
    drivers/s390/cio/vfio_ccw_cp.c | 2 +-
    1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index 70a006ba4d05..ba08fe137c2e 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -392,7 +392,7 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
    			return -EOPNOTSUPP;
    		}
- if ((!ccw_is_chain(ccw)) && (!ccw_is_tic(ccw)))
+		if (!ccw_is_chain(ccw))

OK, this function now returns the length of the chain excluding the
last tic.

    			break;
ccw++;

Now, cp_init will not copy the last tic to the chain. When it then
looks for tics in that new chain, it won't find any, and stop copying.

Sigh.  Let me state up front that I'm running vanilla 5.0-rc7 host
kernel with this patch applied (or not), and QEMU from a week or
two ago.

I stated in another thread that this patch makes things better when
running fio and other random exercisers within my guest.  But I was
always booting off a virtio-blk disk when I did that, and using
some additional disks connected by vfio-ccw.  Today I discovered
that this patch prevents guest boot from vfio-ccw (which was
working previously):

/usr/local/bin/qemu-system-s390x -machine
s390-ccw-virtio,accel=kvm -smp 4 -m 1024 -nographic -s -net none
-device
vfio-ccw,sysfsdev=/sys/class/mdev_bus/0.0.0cbb/9503d38c-b83a-45b1-a2f0-393739817786,devno=fe.0.0e1e,force-orb-pfch=true,bootindex=0
-bios /usr/src/qemu/build/pc-bios/s390-ccw/s390-ccw.img
LOADPARM=[        ] vfio-ccw device I/O error - Interrupt Response
Block Data: Function Ctrl : [Start]
       Activity Ctrl :
       Status Ctrl : [Alert] [Primary] [Secondary] [Status-Pending]
       Device Status : [Channel-End] [Device-End]
       Channel Status : [Incorrect-Length]
       cpa=: 0x0000000000000018
       prev_ccw=: 0x3100003840000008
       this_ccw=: 0x0800001000000000
Eckd Dasd Sense Data (fmt 32-bytes):
       Sense Condition Flags :
       Residual Count     =: 0x0000000000000000
       Phys Drive ID      =: 0x0000000000000000
       low cyl address    =: 0x0000000000000000
       head addr & hi cyl =: 0x0000000000000000
       format/message     =: 0x0000000000000000
       fmt-dependent[0-7] =: 0x0000000000000000
       fmt-dependent[8-15]=: 0x0000000005160f00
       prog action code   =: 0x0000000000000000
       Configuration info =: 0x00000000000040e0
       mcode / hi-cyl     =: 0x0000000000000000
       cyl & head addr [0]=: 0x0000000000000000
       cyl & head addr [1]=: 0x0000000000000000
       cyl & head addr [2]=: 0x0000000000000000
...snip...

Reverting this patch allows the boot to proceed normally.  So,
ugh.

Agreed on the 'ugh' :(

I'm not going to claim to know exactly how this is failing,
because it's too late for coffee and reading IPL records is
difficult enough.  But in the working case (without this patch),
cp_init() calls ccwchain_calc_length() and gets a count of four
CCWs (SEEK + SIDE + TIC
+ READ), which is broken up into two chains.  One for the
SEEK/SIDE/TIC, and another for the READ.  In the failing case
(with this patch), cp_init() only counts three CCWs, we never read
the IPL2 address (the second chain in the working case), and our
boot fails.

I think "we don't copy the tic to the chain where we search for tics"
fits those symptoms.

Indeed.  Hrm...



Unsure what the next best move is here.  It is possible that the
(ill-used) check that Farhan noticed and removed was actually
added to "make boot work" when handling recursive TIC CCWs such as
this.  But I still agree that this patch is correct, even though
it exposes more problems with our TIC handling throughout this
code at large.  Any thoughts?

Yes, there's probably something rotten in our TIC handling...

Just to make things clear: This patch makes doing I/O on a booted
system more stable, but makes the not-yet-merged QEMU bios boot code
fail, right?

Correct.

(I think I deleted that while editing.  Sorry for the confusion.)

np, I thought as much :)



I'll guess I'll stare at the ccw translation code for a bit and see
if something jumps out at me...

See above. We may need two chains: One without the trailing tic, and
one to process to see where that tic points to... or rework the way to
follow the tic? Not sure.

I hate the idea of two chains, because how much storage do we end up needing to consume? But maybe it's inevitable, or maybe I can think up a way to rework it without.



I'll ping Jason a bit later today, and see if anything jumps out at
us before Farhan returns.

I'm wondering whether we should keep this patch and fix on top of it,
or revert it for now... I'm not sure tic processing is working at all
right now.

I've been wondering this too. Yeah, the bios code isn't merged yet, but this patch means there's no point in merging it until we get TIC fixed properly. Which I'd hope is 5.1, but I'm not sure how deep this bowl of spaghetti is.


Maybe we need a tool for testing that throws random channel programs at
the device :)


Are you peeking at my todo/wish list?  :)




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux