Re: [PULL 1/1] vfio-ccw: Don't assume there are more ccws after a TIC

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 19 Feb 2019 21:49:07 -0500
Eric Farman <farman@xxxxxxxxxxxxx> wrote:

> Hi Connie, Farhan,
> 
> On 02/04/2019 12:06 PM, Cornelia Huck wrote:
> > From: Farhan Ali <alifm@xxxxxxxxxxxxx>
> > 
> > When trying to calculate the length of a ccw chain, we assume
> > there are ccws after a TIC. This can lead to overcounting and
> > copying garbage data from guest memory.
> > 
> > Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
> > Message-Id: <d63748c1f1b03147bcbf401596638627a5e35ef7.1548082107.git.alifm@xxxxxxxxxxxxx>
> > Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx>
> > Signed-off-by: Cornelia Huck <cohuck@xxxxxxxxxx>
> > ---
> >   drivers/s390/cio/vfio_ccw_cp.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> > index 70a006ba4d05..ba08fe137c2e 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -392,7 +392,7 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
> >   			return -EOPNOTSUPP;
> >   		}
> >   
> > -		if ((!ccw_is_chain(ccw)) && (!ccw_is_tic(ccw)))
> > +		if (!ccw_is_chain(ccw))
> >   			break;
> >   
> >   		ccw++;
> >   
> 
> Sigh.  Let me state up front that I'm running vanilla 5.0-rc7 host 
> kernel with this patch applied (or not), and QEMU from a week or two ago.
> 
> I stated in another thread that this patch makes things better when 
> running fio and other random exercisers within my guest.  But I was 
> always booting off a virtio-blk disk when I did that, and using some 
> additional disks connected by vfio-ccw.  Today I discovered that this 
> patch prevents guest boot from vfio-ccw (which was working previously):
> 
> /usr/local/bin/qemu-system-s390x -machine s390-ccw-virtio,accel=kvm -smp 
> 4 -m 1024 -nographic -s -net none -device 
> vfio-ccw,sysfsdev=/sys/class/mdev_bus/0.0.0cbb/9503d38c-b83a-45b1-a2f0-393739817786,devno=fe.0.0e1e,force-orb-pfch=true,bootindex=0 
> -bios /usr/src/qemu/build/pc-bios/s390-ccw/s390-ccw.img
> LOADPARM=[        ]
> vfio-ccw device I/O error - Interrupt Response Block Data:
>      Function Ctrl : [Start]
>      Activity Ctrl :
>      Status Ctrl : [Alert] [Primary] [Secondary] [Status-Pending]
>      Device Status : [Channel-End] [Device-End]
>      Channel Status : [Incorrect-Length]
>      cpa=: 0x0000000000000018
>      prev_ccw=: 0x3100003840000008
>      this_ccw=: 0x0800001000000000
> Eckd Dasd Sense Data (fmt 32-bytes):
>      Sense Condition Flags :
>      Residual Count     =: 0x0000000000000000
>      Phys Drive ID      =: 0x0000000000000000
>      low cyl address    =: 0x0000000000000000
>      head addr & hi cyl =: 0x0000000000000000
>      format/message     =: 0x0000000000000000
>      fmt-dependent[0-7] =: 0x0000000000000000
>      fmt-dependent[8-15]=: 0x0000000005160f00
>      prog action code   =: 0x0000000000000000
>      Configuration info =: 0x00000000000040e0
>      mcode / hi-cyl     =: 0x0000000000000000
>      cyl & head addr [0]=: 0x0000000000000000
>      cyl & head addr [1]=: 0x0000000000000000
>      cyl & head addr [2]=: 0x0000000000000000
> ...snip...
> 
> Reverting this patch allows the boot to proceed normally.  So, ugh.

Agreed on the 'ugh' :(

> 
> I'm not going to claim to know exactly how this is failing, because it's 
> too late for coffee and reading IPL records is difficult enough.  But in 
> the working case (without this patch), cp_init() calls 
> ccwchain_calc_length() and gets a count of four CCWs (SEEK + SIDE + TIC 
> + READ), which is broken up into two chains.  One for the SEEK/SIDE/TIC, 
> and another for the READ.  In the failing case (with this patch), 
> cp_init() only counts three CCWs, we never read the IPL2 address (the 
> second chain in the working case), and our boot fails.
> 
> Unsure what the next best move is here.  It is possible that the 
> (ill-used) check that Farhan noticed and removed was actually added to 
> "make boot work" when handling recursive TIC CCWs such as this.  But I 
> still agree that this patch is correct, even though it exposes more 
> problems with our TIC handling throughout this code at large.  Any thoughts?

Yes, there's probably something rotten in our TIC handling...

Just to make things clear: This patch makes doing I/O on a booted
system more stable, but makes the not-yet-merged QEMU bios boot code
fail, right?

I'll guess I'll stare at the ccw translation code for a bit and see if
something jumps out at me...



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux