Re: [PULL 1/1] vfio-ccw: Don't assume there are more ccws after a TIC

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 19 Feb 2019 21:49:07 -0500
Eric Farman <farman@xxxxxxxxxxxxx> wrote:

> Hi Connie, Farhan,
> 
> On 02/04/2019 12:06 PM, Cornelia Huck wrote:
> > From: Farhan Ali <alifm@xxxxxxxxxxxxx>
> > 
> > When trying to calculate the length of a ccw chain, we assume
> > there are ccws after a TIC. This can lead to overcounting and
> > copying garbage data from guest memory.
> > 
> > Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
> > Message-Id: <d63748c1f1b03147bcbf401596638627a5e35ef7.1548082107.git.alifm@xxxxxxxxxxxxx>
> > Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx>
> > Signed-off-by: Cornelia Huck <cohuck@xxxxxxxxxx>
> > ---
> >   drivers/s390/cio/vfio_ccw_cp.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
> > index 70a006ba4d05..ba08fe137c2e 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -392,7 +392,7 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp)
> >   			return -EOPNOTSUPP;
> >   		}
> >   
> > -		if ((!ccw_is_chain(ccw)) && (!ccw_is_tic(ccw)))
> > +		if (!ccw_is_chain(ccw))
> >   			break;
> >   
> >   		ccw++;
> > 
> 
> Sigh.  Let me state up front that I'm running vanilla 5.0-rc7 host 
> kernel with this patch applied (or not), and QEMU from a week or two ago.
> 
> I stated in another thread that this patch makes things better when 
> running fio and other random exercisers within my guest.  But I was 
> always booting off a virtio-blk disk when I did that, and using some 
> additional disks connected by vfio-ccw.  Today I discovered that this 
> patch prevents guest boot from vfio-ccw (which was working previously):
> 
> /usr/local/bin/qemu-system-s390x -machine s390-ccw-virtio,accel=kvm -smp 
> 4 -m 1024 -nographic -s -net none -device 
> vfio-ccw,sysfsdev=/sys/class/mdev_bus/0.0.0cbb/9503d38c-b83a-45b1-a2f0-393739817786,devno=fe.0.0e1e,force-orb-pfch=true,bootindex=0 
> -bios /usr/src/qemu/build/pc-bios/s390-ccw/s390-ccw.img
> LOADPARM=[        ]
> vfio-ccw device I/O error - Interrupt Response Block Data:
>      Function Ctrl : [Start]
>      Activity Ctrl :
>      Status Ctrl : [Alert] [Primary] [Secondary] [Status-Pending]
>      Device Status : [Channel-End] [Device-End]
>      Channel Status : [Incorrect-Length]
>      cpa=: 0x0000000000000018
>      prev_ccw=: 0x3100003840000008
>      this_ccw=: 0x0800001000000000
> Eckd Dasd Sense Data (fmt 32-bytes):
>      Sense Condition Flags :
>      Residual Count     =: 0x0000000000000000
>      Phys Drive ID      =: 0x0000000000000000
>      low cyl address    =: 0x0000000000000000
>      head addr & hi cyl =: 0x0000000000000000
>      format/message     =: 0x0000000000000000
>      fmt-dependent[0-7] =: 0x0000000000000000
>      fmt-dependent[8-15]=: 0x0000000005160f00
>      prog action code   =: 0x0000000000000000
>      Configuration info =: 0x00000000000040e0
>      mcode / hi-cyl     =: 0x0000000000000000
>      cyl & head addr [0]=: 0x0000000000000000
>      cyl & head addr [1]=: 0x0000000000000000
>      cyl & head addr [2]=: 0x0000000000000000
> ...snip...
> 
> Reverting this patch allows the boot to proceed normally.  So, ugh.
> 
> I'm not going to claim to know exactly how this is failing, because it's 
> too late for coffee and reading IPL records is difficult enough.  But in 
> the working case (without this patch), cp_init() calls 
> ccwchain_calc_length() and gets a count of four CCWs (SEEK + SIDE + TIC 
> + READ), which is broken up into two chains.  One for the SEEK/SIDE/TIC, 
> and another for the READ.  In the failing case (with this patch), 
> cp_init() only counts three CCWs, we never read the IPL2 address (the 
> second chain in the working case), and our boot fails.
> 
> Unsure what the next best move is here.  It is possible that the 
> (ill-used) check that Farhan noticed and removed was actually added to 
> "make boot work" when handling recursive TIC CCWs such as this.  But I 
> still agree that this patch is correct, even though it exposes more 
> problems with our TIC handling throughout this code at large.  Any thoughts?
> 

With the annotation cp[] := {SEEK, SIDE, TIC, READ};

Do we jump over cp[2] (TIC) because we get presented with a status
modifier when executing cp[1] (SIDE)? Where does the cp[2] (TIC) point
to? Does it point to cp[3] (READ)? There should be a new chain
originating from the TIC in both cases.

As I stated previously, there is usally no way to tell where is the end
of an channel program: we can get status modifier and then we have to
jump over the next ccw (which may look like the last one if this jump
on status modifier is ignored). That's why I proposed poisoning what we
suspect is the after last ccw of the channel program.

Does that make sense?

Regards,
Halil




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux